Abstract — Cognitive radio methodologies have the potential 
to dramatically increase the throughput of wireless systems. 
Herein, control strategies which enable the superposition in time 
and frequency of primary and secondary user transmissions 
are explored in contrast to more traditional sensing approaches 
which only allow the secondary user to transmit when the 
primary user is idle. In this work, the optimal transmission 
policy for the secondary user when the primary user adopts 
a retransmission based error control scheme is investigated. The 
policy aims to maximize the secondary users' throughput, with 
a constraint on the throughput loss and failure probability of 
the primary user. Due to the constraint, the optimal policy 
is randomized, and determines how often the secondary user 
transmits according to the retransmission state of the packet 
being served by the primary user. The resulting optimal strategy 
of the secondary user is proven to have a unique structure. In 
particular, the optimal throughput is achieved by the secondary 
user by concentrating its transmission, and thus its interference 
to the primary user, in the first transmissions of a primary user 
packet. The rather simple framework considered in this paper 
highlights two fundamental aspects of cognitive networks that 
have not been covered so far: (i) the networking mechanisms 
implemented by the primary users (error control by means 
of retransmissions in the considered model) react to secondary 
users' activity; (ii) if networking mechanisms are considered, 
then their state must be taken into account when optimizing 
secondary users' strategy, i.e., a strategy based on a binary 
active/idle perception of the primary users' state is suboptimal. 

Index Terms — Automatic retransmission request (ARQ), cog- 
nitive radios, Markov processes, reactive primary users, wireless 
networks. 



I. Introduction 

Cognitive radio has been the subject of intense research 
of late, e.g., [1]-[5], due to its potential to increase the 
efficiency of wireless networks. Unlicensed secondary users 
adapt their operations around those of the primary users and 
the surrounding network environment to opportunistically ex- 
ploit available resources while limiting their interference with 
licensed primary users. Most prior work [1]-[4] focuses on 
a white space approach, where the secondary users sense the 
channel in order to detect time/frequency slots left unused by 
the primary users and exploit them for transmission. Pure white 
space approaches are based on a zero-interference rationale, 
i.e., the objective of the secondary user is to not interfere 
at all with the primary user. However, sensing errors may 
lead to unwanted collisions, thus degrading the throughput 
achieved by the latter. Typically, primary users are modeled 
via a fixed Markov chain tracking the idle-busy channel state, 
irrespective of the operations of the secondary users, according 
to the general assumption that primary users are dumb and 
non-adaptive devices. 

However, such a model may not always be accurate. For 
instance, a collision may force a primary user to schedule a 
retransmission and enter a backoff period. As a consequence, 
a collision may modify the arrival rate of the packets at 
the primary destination, while changing the characterization 
of the generated traffic (burstiness of idle/busy slots). 
Therefore, the interaction between the primary users and 
the secondary users must be considered when analyzing the 
network. Additionally, the use of signal processing methods 
(multiuser detection and multiple-input multiple-output sys- 
tems) enables the superposition of secondary transmissions 
over a primary transmission while achieving accurate decoding 
of the primary user packet. 

There exists some prior literature investigating the coex- 
istence in the same time/frequency band of primary and 
secondary users with a focus on physical layer methods for 
static scenarios [5]-[8], [10], [11]. A thorough discussion 
of spectrum sharing under performance constraints from an 
information theoretic perspective can be found in [12]. Those 
approaches, though valuable in some broadcasting network 
scenarios, do not characterize the dynamic interaction between 
the two classes of users. In contrast, our prior work, which 
inspires the current paper, studies concurrent transmission by 
secondary and primary users in a highly dynamic environment 
[13]. We explicitly consider an interference mitigation 
scenario, where the secondary user is allowed to transmit 
concurrently to the primary user, with a constraint on the 
performance loss suffered by the latter, in terms of either a 
reduced throughput or an increased failure probability. 

In this work, we study access control policies for sec- 
ondary users in wireless networks where nodes implement a 
retransmission-based error control scheme. Some prior liter- 
ature has investigated networks of primary users implementing 
Automatic Retransmission reQuest (ARQ). In [14], the 
secondary user exploits the retransmissions of primary user 



packets in order to achieve a higher transmission rate. In 
fact, the secondary receiver can potentially decode the primary 
user's packet in the first transmission and then opportunistically 
cancel interference [11] in the following retransmissions. 
However, the framework in [14] does not consider the dynamics 
of the network and the bias in the channel availability 
generated by interference. In [15], Eswaran et al. propose a 
framework where the secondary user exploits ARQ feedback 
to estimate the throughput loss of the primary user and tune 
the transmission policy accordingly, by using information 
theoretic results. In [16], Zhang proposed a learning algorithm 
for a scenario in which the primary user adapts the transmitted 
power in response to interference. 

The contribution of the present paper is to introduce the 
reactive primary user scenario, where the activity of the 
secondary users biases the temporal evolution of the stochastic 
process tracking the state of the primary users. A Markov 
model is proposed and the optimization problem is formulated 
as a constrained Markov Decision Process (MDP). The struc- 
ture of the optimal policy is derived analytically for a specific 
case. We focus on a network with two mutually interfering 
links, one primary and one secondary. In the framework 
considered, a packet may be retransmitted a finite number of 
times, due to transmission failure, before being discarded by 
the transmitter. Packet arrivals at the primary user are modeled 
with a fixed probability that an empty slot, i.e., a slot in which 
a retransmission is not scheduled, is accessed for transmission. 

We study the interference that the secondary user causes 
to the primary user and how this interference impacts the 
retransmission process of the latter. We explicitly consider 
an interference mitigation scenario, where the secondary user 
is allowed to transmit concurrently to the primary user, with 
a constraint on the performance loss suffered by the latter, 
in terms of either a reduced throughput or an increased 
failure probability. Our analysis is based on a detailed Markov 
model of the network, accounting for the distortion of the 
retransmission process caused by secondary user transmis- 
sions. Remarkably, this simple model captures a fundamental 
aspect of cognitive networks: the control mechanisms im- 
plemented by the primary users react to the activity of the 
secondary users. An accurate stochastic model for the activity 
of the primary users should include this effect. In the model 
considered herein, interference from the secondary source 
increases the probability that primary source's transmissions 
fail. As a consequence, retransmissions are triggered more 
often, and the stochastic characterization of primary source's 
channel occupation changes. Moreover, as the motion law of 
the state of the primary user depends on both the action of 
the secondary user and the current state of the primary user 
(the retransmission index of the packet being served in the 
considered model), then the secondary users' strategy should 
be based on the state of the primary user. This means that 
a binary active/idle representation of the state of the primary 
users leads to suboptimal policies. 

In this framework, interference due to the activity of the 
secondary user not only reduces the instantaneous average 
revenue collected by the primary user in each state of the 
Markov chain modeling the network, but also changes the 



transition probabilities, and thus the steady-state distribution, 
of the chain. 

The optimization problem can be formalized through a 
Linear Program. Due to the constraint on the maximum perfor- 
mance loss of the primary user, the solution is a randomized 
policy, i.e., the optimal policy assigns a probability to each ac- 
tion in the action set given the state of the underlying Markov 
process. As we focus on a binary transmission/idleness action 
set, the randomized policy simply determines how often the 
secondary user transmits given the state of the network. 

This problem, though conceptually simple, unveils im- 
portant issues and general behaviors. As the primary user 
implements a retransmission-based error control mechanism, 
the activity of the secondary user biases the retransmission 
process via interference. Interference at the primary receiver 
increases the failure probability of primary user's transmis- 
sions. Therefore, due to the activity of the secondary user 
the average number of transmissions of a primary user's 
packet gets larger, together with the average time required to 
return to primary user's idle state. Interestingly, the increase 
of the average number of transmissions of primary user's 
packets depends on the index of the interfered transmission. 
For instance, while interference from the secondary user in 
the first transmission of the primary user's packets potentially 
leads to a significant increase of the number of transmissions 
per packet, transmission by the secondary user in the last 
allowed transmission of primary user's packets does not in- 
crease the average number of transmissions at all. Thus, as 
observed before, the impact of secondary users' transmission 
in the various states critically depends on the state of the 
primary network. On the other hand, first transmissions occur 
more frequently than last transmissions, and, thus, the overall 
throughput collected by the secondary user as a function 
of the strategy greatly depends on the states in which it 
concentrates its transmissions. The interplay between the cost 
of the primary user and the reward of the secondary one due 
to the modifications of the steady-state distribution determines 
the optimal strategy. 

An important observation concerns the availability of time 
slots in which the primary user is idle, i.e., the white spaces. 
As the primary user implements a retransmission-based error 
control mechanism, failed decoding at the primary receiver 
triggers a further transmission of the packet, until the maxi- 
mum number of transmissions per packet is reached. There- 
fore, the interference generated by the activity of the secondary 
user increases the fraction of channel resource occupied by the 
primary user. This means that the availability of white spaces 
decreases as the activity of the secondary user increases. This 
is an additional reason for carefully designing the strategy of 
the secondary user. 

If transmission by the primary user does not affect reception 
at the secondary receiver, and either throughput or packet 
failure probability is considered as the metric for the primary 
user performance, the optimal transmission strategy of the 
secondary user is shown to have a unique structure. The 
throughput-optimal strategy concentrates transmissions by the 
secondary user in the region of the state space corresponding 
to the first transmissions of a primary user's packet. According 
to such optimal policy, the secondary user transmits with 
probability 1 up to the $N_1$-th transmission of a primary user's 
packet, with probability in [0, 1] in time slots in which the 
primary user is performing the $N_1$-th transmission of a packet, 
and with probability 0 otherwise. The boundary state and the 
associated transmission probability are determined so as to result 
in a bounded reduction of the time-average performance of 
the primary user. This result also provides a simple algorithm 
to solve the linear program resulting from the constrained 
optimization problem. 

Figure 1. Considered network. Direct links and interfering links are 
represented by solid and dashed arrows, respectively. 

We also observe that the maximum aggressiveness of the 
secondary user depends on the arrival rate at the primary user. 
In fact, when the primary source spends most of its time idle, 
a longer retransmission process has a less deleterious effect 
on throughput. 

The rest of the paper is organized as follows. Section II 
describes the network scenario considered throughout the 
paper. Section III defines the optimization problem and derives 
the Markov model of the network. In Section IV, the structure 
of the optimal strategy for the case in which primary users' 
transmission does not affect packet reception at the secondary 
receiver is derived. Section V discusses the optimal transmission 
policy for the general case. In Section VI, numerical 
results highlighting the fundamental issues and behaviors 
described in the previous sections are shown. Section VII 
concludes the paper. 

II. Network Description 

Consider the network in Fig. 1 with a primary and a 
secondary source, namely $S_P$ and $S_S$. The primary source 
$S_P$ and the secondary source $S_S$ transmit packets to their 
respective destinations, namely $D_P$ and $D_S$. 

The reception of a packet at a particular destination is 
interfered with by the transmission of the other source. Our 
model subsumes the white space approach which typically 
assumes that a collision results in a decoding failure. An 
alternative view is that the collision approach implies a sec- 
ondary access policy that will result in no throughput loss 
for the primary user. In contrast, we assign decoding error 
probabilities to the primary and secondary destinations as an 
abstraction of various interference mitigation methods. This in 
turn will result in some throughput loss and increase of the 
packet failure probability as a function of the access strategy 
of the secondary user. It is trivial to show that the white space 
approach is optimal for the constraint of no collisions. 



We assume a quasi-static channel model, where time is 
divided into slots of fixed duration and the channel gain of a 
certain link remains constant within a slot, and is independent 
of the channel gains in the other slots. We denote by $g_{PP}$, 
$g_{PS}$, $g_{SS}$ and $g_{SP}$ the random variables corresponding to the 
channel coefficients between $S_P$ and $D_P$, $S_P$ and $D_S$, 
$S_S$ and $D_S$, and $S_S$ and $D_P$, respectively, and with $\zeta_{PP}(g)$, 
$\zeta_{PS}(g)$, $\zeta_{SS}(g)$ and $\zeta_{SP}(g)$ their respective probability 
density functions. 

Assuming that the transmission of a packet fits a slot, 
the performance of the receiver can be modeled via the 
average decoding failure probability, that depends on the 
packet encoding, transmission rates, structure of the receiver 
and average channel gains, as well as the activity of the 
concurrent source. The average decoding failure probability at 
the primary destination $D_P$ associated with a silent secondary 
source is denoted by $\rho>0$, while the same probability when the 
secondary source transmits is $\rho^*\ge\rho$. Analogously, the average 
decoding failure probability at the secondary destination $D_S$ 
when the primary source is silent and transmitting is denoted 
with $\nu>0$ and $\nu^*\ge\nu$, respectively. 

The construction fits many models and assumptions on 
the architecture of the physical layer and the transmission 
protocols. For instance, one may assume that the primary 
destination performs signal decoding unaware of the presence 
of the secondary source, and thus treats its signal as noise, 
whereas the secondary receiver adopts a smarter decoding 
strategy, by either treating as noise or decoding and canceling 
the signal from the primary source according to the transmission 
rates, powers and channel coefficients [17]. 

Denoting the transmission rate and power of the primary 
and secondary sources with $R_P$, $P_P$, $R_S$, $P_S$,¹ respectively, 
we obtain the following failure probabilities for the primary 
link 

$$\rho = P\{R_P > C(g_{PP}P_P)\} \tag{1}$$

$$\rho^* = P\left\{R_P > C\left(\frac{g_{PP}P_P}{1+g_{SP}P_S}\right)\right\}, \tag{2}$$

where $C(x)=\log(1+x)$. 

For the secondary link we obtain 

$$\nu = P\{R_S > C(g_{SS}P_S)\} \tag{3}$$

$$\nu^* = P\{\{R_P,R_S\}\notin\mathcal{R}\}, \tag{4}$$

where $\mathcal{R}$ is the set of all the rate pairs $\{R_P,R_S\}$ such that 

$$R_S < C(g_{SS}P_S) \tag{5}$$

$$R_P + R_S < C(g_{PS}P_P+g_{SS}P_S), \tag{6}$$

or 

$$R_S < C\left(\frac{g_{SS}P_S}{1+g_{PS}P_P}\right), \tag{7}$$

where Eqs. (5) and (6) refer to the achievable rate region 
corresponding to the secondary receiver performing interference 
cancellation, while Eq. (7) refers to the case in which the signal 
from the primary source is treated as noise by the secondary 
receiver. The failure probabilities listed above admit a simple 
integral form and can be easily computed. 
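
As an illustration of how such probabilities can be evaluated, the sketch below computes the two primary-link failure probabilities in closed form. It rests on assumptions that the model above does not fix: Rayleigh fading (exponentially distributed power gains $g_{PP}$ and $g_{SP}$ with means `mean_pp` and `mean_sp`), a base-2 logarithm in $C(x)$ so that rates are in bit/s/Hz, and a primary receiver that treats the secondary signal as noise. All function and parameter names are hypothetical.

```python
import math

def capacity_threshold(rate):
    """SNR needed to support `rate` [bit/s/Hz], assuming C(x) = log2(1 + x)."""
    return 2.0 ** rate - 1.0

def rho_no_interference(rate_p, power_p, mean_pp=1.0):
    """P{R_P > C(g_PP P_P)} for an exponential gain g_PP (assumed Rayleigh fading)."""
    t = capacity_threshold(rate_p)
    return 1.0 - math.exp(-t / (power_p * mean_pp))

def rho_with_interference(rate_p, power_p, power_s, mean_pp=1.0, mean_sp=1.0):
    """P{R_P > C(g_PP P_P / (1 + g_SP P_S))}, secondary signal treated as noise.

    Conditioning on g_SP and averaging exp(-a (1 + g_SP P_S)) over an
    exponential g_SP yields exp(-a) / (1 + a P_S mean_sp).
    """
    a = capacity_threshold(rate_p) / (power_p * mean_pp)
    return 1.0 - math.exp(-a) / (1.0 + a * power_s * mean_sp)

# Illustrative values only: powers are normalized to the noise power.
rho = rho_no_interference(rate_p=1.0, power_p=10.0)
rho_star = rho_with_interference(rate_p=1.0, power_p=10.0, power_s=5.0)
```

With these inputs the interfered failure probability `rho_star` is larger than `rho`, and it reduces to `rho` when `power_s` is zero.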

¹ In the following example, rates $R_P$ and $R_S$ are expressed in [bit/s/Hz] 
and the transmission powers $P_P$ and $P_S$ are normalized to the noise power. 




We remark that we do not consider a specific physical 
layer architecture or transmission technique, but rather, we 
refer to the simple construction based on the average decoding 
probabilities described before. 

In order to improve reliability, the primary source imple- 
ments a retransmission-based error control scheme, by which 
a failed packet is retransmitted in the subsequent slot. We 
consider a finite-retransmission process, where each packet 
can be transmitted at most T times, see Fig. 2(a). Delayed 
retransmissions do not alter the following discussion. If the 
packet has been transmitted T times, it is discarded by the 
primary source. It is assumed that the destination sends an 
acknowledgment packet after each received packet, in order 
to make the source aware of the outcome of the transmission. 
Note that this scheme can be classified either as an automatic 
retransmission request (ARQ) or a type-I hybrid ARQ scheme 
depending on whether or not the packets are encoded before 
transmission. For the sake of simplicity, the secondary source 
is assumed to transmit each packet only once. This assumption 
is consistent with the common characterization of secondary 
users as opportunistic sources without strict quality of service 
guarantees ("best effort"). 

Unless a retransmission is scheduled, the primary source 
$S_P$ accesses the channel in each slot to transmit a fresh 
packet with fixed probability $\alpha$, with $0<\alpha<1$. The secondary 
source is assumed to be backlogged, i.e., it always has a 
packet to transmit. However, a packet arrival process at the 
secondary source can be included in the model with some 
straightforward modifications to the analysis of the following 
section. Nevertheless, its inclusion does not add any insight 
to the discussion presented in this paper, while it complicates 
the formulae. 

The channel access strategy of the secondary source follows 
a policy $\mu$, whose action set is $\mathcal{U}=\{0,1\}$, where 0 and 1 
correspond to a silent and a transmitting source, respectively. 
We remark that transmission by the secondary source increases 



the probability of decoding failure at the primary receiver. 
Thus, the transition probabilities of the Markov chain strongly 
depend on the secondary user's activity and there is an explicit 
dependence between the stochastic characterization of the 
primary source activity and the activity of the secondary 
source. 

The following discussion is specialized to a constraint 
defined on the throughput loss of the primary source. A 
constraint posed on the increase of the failure probability only 
results in a different definition of the average primary source's 
cost, as reported in Section IV-B. 

The throughput achieved by the primary source under policy 
$\mu$ can be written as 

$$W_P(\mu) = \limsup_{N\to+\infty} \frac{1}{\tau N}\sum_{n=1}^{N} E\left[I\left(S_P^n(\mu)\right)\right] L_P, \tag{8}$$

where $L_P$ is the size, in bits, of the packets sent by the 
primary source, $\tau$ is the duration of a slot, $S_P^n(\mu)$ is the 
event corresponding to a successfully delivered packet by the 
primary source in slot $n$, $I(\cdot)$ is the indicator function and $E$ 
denotes average. The throughput of the secondary user admits 
an analogous expression. 

The goal of the secondary source is to maximize its own 
achieved throughput while limiting throughput loss to the 
primary source. In particular, let us denote as $\mu_0$ the policy by 
which the secondary source never transmits. The optimization 
problem can be written as the following infinite horizon 
constrained Markov decision process [18]: 

$$\hat{\mu} = \arg\min_{\mu} J_S(\mu) \quad \text{s.t.} \quad W_P(\mu_0)-W_P(\mu)\le\sigma, \tag{9}$$

where $J_S(\mu)$ is the average cost incurred by the secondary 
source² and can be computed as $J_S(\mu)=L_S/\tau-W_S(\mu)$. 
For two arbitrary policies $\mu_1$ and $\mu_2$, we refer 
to $W_P(\mu_1)-W_P(\mu_2)$ as $\Delta(\mu_1,\mu_2)$. Note that the throughput 
loss can be also defined as the difference of average costs 
$\Delta(\mu_1,\mu_2)=J_P(\mu_2)-J_P(\mu_1)$. 

² In the following we will denote by $J_P$ the analogous cost defined for the 
primary source. 

According to [19], the solution of the optimization problem 
(9) is a past-independent randomized policy. Moreover, as the 
number of independent constraints is equal to one, then, in 
the optimal stationary policy, randomization occurs in at most 
one state, i.e., the map is either deterministic in all states or 
deterministic in all states except one in which the decision is 
randomized. 

III. Markov Chain and Optimization of the 
Network 

The state of the network can be modeled as a homogeneous 
Markov process $\Theta=\{\Theta_1,\Theta_2,\ldots\}$ taking values in the state 
space $\mathcal{X}=\{0,1,\ldots,T\}$, where $\Theta_n=0$ and $\Theta_n=\theta$, $1\le\theta\le T$, 
correspond to $S_P$ not accessing the channel and performing 
the $\theta$-th transmission of a packet in slot $n$, respectively. Since 
the secondary source is backlogged and transmits each packet 
only once, its status is the same in each slot and we do not 
need to account for it in the model. A graphical representation 
of the Markov chain is depicted in Fig. 2(b). 

It can be shown that the solution of the problem in (9) is 
a randomized past-independent stationary policy [19]. Thus, 
the policy $\mu$ maps the state of the network $\theta\in\mathcal{X}$ to the 
probability that the secondary source takes the actions in $\mathcal{U}$. 
The action selected in the time slot $n$ is referred to as $u_n\in\mathcal{U}$ 
in the following. We define $\mu(\theta,u)$ as the probability that 
the secondary source takes action $u$ when the network is in 
state $\theta$. As $\mathcal{U}$ is binary, the policy can be defined as the 
vector $\kappa=\{\kappa_0,\kappa_1,\ldots,\kappa_T\}$, where $0\le\kappa_\theta=\mu(\theta,1)\le 1$. $\kappa_\theta$ is 
the probability that the secondary source accesses the channel 
when the network is in state $\theta$. Policy $\mu_0$ corresponds to the 
all-zero vector $\underline{0}$. For the sake of simplicity, in the following 
$L_P/\tau=L_S/\tau$ is set to one. 

The transition probability from state $\theta$ to state 
$\theta'$, $\theta,\theta'\in\mathcal{X}$, conditioned on the action $u$ is defined as 

$$\zeta_u(\theta,\theta') = P\{\Theta_{n+1}=\theta' \mid \Theta_n=\theta,\, u_n=u\}, \tag{10}$$

and does not depend on $n$. Note that since the policy $\mu$ is 
past-independent, the stochastic process which models 
the temporal evolution of the network is a Markov process. 
We remark that, due to the mutual interference, the probability 
that the Markov process transitions from one state to another 
in the state space depends on the action taken by the secondary 
user. 

The transition matrix of the chain, which collects the 
transition probabilities $\zeta_\mu(\theta,\theta')$, is 

$$\begin{pmatrix}
1-\alpha & \alpha & 0 & \cdots & 0 \\
(1-\alpha)(1-\rho_1) & \alpha(1-\rho_1) & \rho_1 & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
(1-\alpha)(1-\rho_{T-1}) & \alpha(1-\rho_{T-1}) & 0 & \cdots & \rho_{T-1} \\
1-\alpha & \alpha & 0 & \cdots & 0
\end{pmatrix}, \tag{11}$$

where $\rho_\theta$ depends on the transmission probability of the 
secondary user $\kappa_\theta$ and represents the failure probability of 
the primary source in state $\theta$ conditioned on the transmission 
strategy. Thus, from state 0, the network moves to state 1 (new 
packet in the buffer of $S_P$) with probability $\alpha$, and remains 
in 0 otherwise. In each state $\theta$, $1\le\theta<T$, the network moves 
to state $\theta+1$ if a failure occurs, while it returns to 0 or 1 
according to the arrival probability $\alpha$ if the primary packet is 
successfully delivered. From state $T$, the transmission of the 
current packet is terminated regardless of failure or success, 
and thus the network returns to states 0 and 1 with probability 
$1-\alpha$ and $\alpha$, respectively. 

The steady-state distribution $\pi_\mu$ of the Markov chain is the 
solution of the following system of equations 

$$\begin{aligned}
\pi_\mu(0) &= (1-\alpha)\pi_\mu(0)+(1-\alpha)\sum_{t=1}^{T-1}(1-\rho_t)\pi_\mu(t)+(1-\alpha)\pi_\mu(T) \\
\pi_\mu(\theta) &= \rho_{\theta-1}\pi_\mu(\theta-1) \quad \text{for } 2\le\theta\le T,
\end{aligned} \tag{12}$$

with the normalization condition $\pi_\mu(1)=1-\pi_\mu(0)-\sum_{\theta=2}^{T}\pi_\mu(\theta)$. 

As intuition suggests, states corresponding to a larger number 
of transmissions are hit by the process a smaller number of 
times with respect to those associated with a smaller number 
of transmissions of the same packet, i.e., $\pi_\mu(\theta+1)\le\pi_\mu(\theta)$ for 
any $\theta>0$. In fact, the process enters state $\theta+1$, $\theta>0$, only 
by passing through $\theta$. This can be observed in Eq. (12), 
by which we get $\pi_\mu(\theta)=\pi_\mu(1)\prod_{i=1}^{\theta-1}\rho_i$, for $2\le\theta\le T$, where 
$\prod_{i=1}^{0}\rho_i=1$. The steady-state distribution is 

$$\pi_\mu(0)=\frac{1-\alpha}{1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}\rho_i},\quad
\pi_\mu(1)=\frac{\alpha}{1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}\rho_i},\quad
\pi_\mu(\theta)=\frac{\alpha\prod_{i=1}^{\theta-1}\rho_i}{1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}\rho_i}
\ \text{ for } 2\le\theta\le T. \tag{13}$$

The average cost of the primary source can be rewritten as 
$J_P(\mu)=\sum_{\theta\in\mathcal{X}}\pi_\mu(\theta)\gamma_P(\mu,\theta)$, where $\gamma_P(\mu,\theta)$ is the average 
cost collected by the primary source in state $\theta$ under policy $\mu$. 
The average cost difference $\Delta(\mu_1,\mu_2)$ is equal to 

$$\Delta(\mu_1,\mu_2)=\sum_{\theta\in\mathcal{X}}\left(\pi_{\mu_2}(\theta)\gamma_P(\mu_2,\theta)-\pi_{\mu_1}(\theta)\gamma_P(\mu_1,\theta)\right). \tag{14}$$

Different policies result in different average costs collected in 
each state, but correspond to different steady-state distributions 
as well. 

The average cost in state $\theta$ can be computed as 

$$\gamma_P(\mu,\theta)=\sum_{\theta_1\in\mathcal{X}}\sum_{u\in\mathcal{U}}\mu(\theta,u)\,\zeta_u(\theta,\theta_1)\,\gamma_P(\theta,\theta_1), \tag{15}$$

where $\gamma_P(\theta,\theta_1)$ is the cost incurred by $S_P$ during the 
transition from $\theta$ to $\theta_1$. The cost $\gamma_P(\theta,\theta_1)$ is equal to zero 
for the transitions in which a packet is successfully delivered 
and to $L_P/\tau=1$ when a packet incurs failure.³ Note 
that when $\theta=T$, the cost of any transition is $\rho_T L_P/\tau=\rho_T$. 
The throughput can be similarly defined as the sum of 
the steady-state distribution weighed by the average rewards 
$\omega_P(\mu,\theta)=L_P/\tau-\gamma_P(\mu,\theta)=1-\gamma_P(\mu,\theta)$. Analogous definitions 
can be stated for the secondary link. In the following, with 
a slight abuse of notation, we denote the average cost and 
reward from state $\theta$ when the action $u$ is selected as $\gamma(u,\theta)$, 
$\omega(u,\theta)$, respectively. 

³ We recall that, without any loss of generality, $L_P/\tau$ is set to unity. 

Figure 3. Graphical representation of the failure probability increasing factor 
associated with the interference generated by the secondary source to the 
primary source's transmission. 

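The average costs in the various states are trivially 
$\gamma_P(\mu,0)=L_P/\tau=1$ and $\gamma_P(\mu,\theta)=\rho_\theta L_P/\tau=\rho_\theta$, $1\le\theta\le T$. The 
average cost of the primary source can be thus written as 

$$J_P(\boldsymbol{\rho})=\frac{(1-\alpha)+\alpha\sum_{t=1}^{T}\prod_{i=1}^{t}\rho_i}{1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}\rho_i}\le 1, \tag{16}$$

with $\boldsymbol{\rho}=\{\rho_1,\ldots,\rho_T\}$. 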
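
The closed-form expressions can be cross-checked numerically. The following minimal sketch builds the transition matrix for a given vector of per-state failure probabilities, obtains the steady-state distribution by power iteration, and compares it with the closed form; the function names and the numerical values are illustrative assumptions, and $T\ge 1$ is assumed.

```python
def transition_matrix(alpha, rho):
    """Transition matrix over states 0..T; rho[t-1] is the primary failure
    probability during the t-th transmission under the chosen policy."""
    T = len(rho)
    P = [[0.0] * (T + 1) for _ in range(T + 1)]
    P[0][0], P[0][1] = 1.0 - alpha, alpha
    for t in range(1, T):
        P[t][0] = (1.0 - alpha) * (1.0 - rho[t - 1])
        P[t][1] = alpha * (1.0 - rho[t - 1])
        P[t][t + 1] = rho[t - 1]
    P[T][0], P[T][1] = 1.0 - alpha, alpha   # packet leaves after T attempts
    return P

def stationary_by_iteration(P, iters=5000):
    """Power iteration: repeatedly apply pi <- pi P from a uniform start."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def stationary_closed_form(alpha, rho):
    """Closed-form steady state: pi(0), pi(1), ..., pi(T)."""
    T = len(rho)
    prods = [1.0]                            # prods[k] = rho_1 * ... * rho_k
    for r in rho:
        prods.append(prods[-1] * r)
    denom = 1.0 + alpha * sum(prods[1:T])    # sum over t = 1..T-1
    tail = [alpha * prods[t - 1] / denom for t in range(1, T + 1)]
    return [(1.0 - alpha) / denom] + tail

def primary_cost(alpha, rho):
    """Average primary cost: cost 1 in the idle state, rho_theta in state theta."""
    pi = stationary_closed_form(alpha, rho)
    return pi[0] + sum(pi[t] * rho[t - 1] for t in range(1, len(rho) + 1))

alpha, rho = 0.3, [0.2, 0.35, 0.5, 0.5]      # illustrative values only
pi_num = stationary_by_iteration(transition_matrix(alpha, rho))
pi_cf = stationary_closed_form(alpha, rho)
max_gap = max(abs(a - b) for a, b in zip(pi_num, pi_cf))
cost = primary_cost(alpha, rho)
```

Here `max_gap` measures the largest disagreement between the iterated and closed-form distributions, and `cost` is the resulting average primary cost.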

The interference by the secondary source in a certain state 
has two effects on the performance of the primary source: 

• if $\theta>0$, interference increases the instantaneous cost 
collected in that state by the primary source; 

• if $0<\theta<T$, interference increases the probability that the 
process moves to $\theta+1$. 

Clearly, transmission by $S_S$ in state 0 does not have any effect 
on the primary source, while in $T$ it only increases the cost 
associated with that state, as the packet being served by $S_P$ is 
discarded after this transmission. 

As observed in the Introduction, if the secondary source 
transmits in a state $\theta$, with $0<\theta<T$, the average number of 
transmissions of the packets of the primary source increases. 
This means that the fraction of time spent by the primary 
source in the idle state decreases. By interfering with the 
primary source, the secondary source thus decreases the 
number of idle slots, that is, the white spaces are reduced. 

The average failure probability of the primary source in 
state $\theta>0$, conditioned on the policy, is $\rho_\theta=(1-\kappa_\theta)\rho+\kappa_\theta\rho^*$. 
In fact, when in state $\theta$, the primary source incurs a failure 
probability equal to $\rho^*$ if the secondary source transmits and 
equal to $\rho$ if the secondary source does not transmit. 
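
The reduction of white spaces can be illustrated with a short numerical sketch. It evaluates the steady-state probability of the idle state, $(1-\alpha)/(1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}\rho_i)$, for three hypothetical secondary policies with $T=3$; the function name and all numerical values are assumptions chosen for illustration.

```python
def idle_fraction(alpha, rho, rho_star, kappa):
    """Steady-state probability of the idle state 0.

    kappa[t-1] is the secondary transmission probability during the
    t-th primary transmission, t = 1..T.
    """
    total, prod = 0.0, 1.0
    for k in kappa[:-1]:                       # only states 1..T-1 matter
        prod *= (1.0 - k) * rho + k * rho_star
        total += prod
    return (1.0 - alpha) / (1.0 + alpha * total)

alpha, rho, rho_star = 0.3, 0.2, 0.6           # illustrative values only
idle_silent = idle_fraction(alpha, rho, rho_star, [0.0, 0.0, 0.0])
idle_first = idle_fraction(alpha, rho, rho_star, [1.0, 0.0, 0.0])
idle_last = idle_fraction(alpha, rho, rho_star, [0.0, 0.0, 1.0])
```

Interfering with the first transmission lowers the idle fraction (`idle_first < idle_silent`), whereas interfering only with the last allowed transmission leaves it unchanged (`idle_last == idle_silent`), since the packet leaves the system after state $T$ regardless of the outcome.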

In order to provide a more intuitive explanation of the 
dependence between the decoding performance degradation at 
$D_P$ and transmission by $S_S$, we define the failure probability 
increasing factor $\lambda$, such that $\rho^*=\rho+(1-\rho)\lambda$. Thus, $\lambda$ 
determines the impact of transmission by the secondary source on 
the decoding probability at the primary receiver: the larger $\lambda$, 
the closer to one the probability of failure. In particular, for 
$\lambda=0$ and $\lambda=1$, the failure probability at the primary source is $\rho$ 
and 1, respectively (see Fig. 3 for a graphical representation). 

The resulting average failure rate in $\theta$ is $\rho_\theta=\rho+(1-\rho)\lambda\kappa_\theta$. 
Note that $\lambda$ parameterizes the difference between the failure 
probability with and without interference from the secondary 
user and does not presume the use of a linear model for the 
failure probability as a function of the interference power. 

Consider $\kappa_0$, the probability that the secondary source 
transmits when the primary source is idle. As increasing $\kappa_0$ 
does not affect the cost to the primary user (which is idle), 
the optimal value for $\kappa_0$ is one.⁴ Thus, in the sequel, we set 
$\kappa_0=1$. 

The average cost collected by the primary source is then 

$$J_P(\kappa)=\frac{(1-\alpha)+\alpha\sum_{t=1}^{T}\prod_{i=1}^{t}\left(\rho+(1-\rho)\lambda\kappa_i\right)}{1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}\left(\rho+(1-\rho)\lambda\kappa_i\right)}. \tag{17}$$

The average throughput achieved by the secondary source, also 
referred to as the reward in the following, can be computed 
as in Eq. (18). 

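The optimization problem (9) is equivalent to the following 
linear program (LP) [19] 

$$\begin{aligned}
Z = \arg\max_{Z}\ & \sum_{\theta\in\mathcal{X}}\sum_{u\in\mathcal{U}}\omega(u,\theta)z_u(\theta) \\
\text{s.t. } & \sum_{\theta\in\mathcal{X}}\sum_{u\in\mathcal{U}}\gamma(u,\theta)z_u(\theta)\le\sigma+J_P(\mu_0) \\
& \sum_{u\in\mathcal{U}}z_u(\theta_1)=\sum_{\theta\in\mathcal{X}}\sum_{u\in\mathcal{U}}z_u(\theta)\zeta_u(\theta,\theta_1),\quad\forall\theta_1\in\mathcal{X} \\
& z_u(\theta)\ge 0,\quad\forall u,\theta,
\end{aligned} \tag{19}$$

where $Z=\{z_u(\theta)\}_{\theta\in\mathcal{X},u\in\mathcal{U}}$, and $z_u(\theta)$ represents the joint 
probability that the Markov chain is in state $\theta$ and action $u$ 
is selected. The first constraint bounds the maximum performance 
loss of the primary user, while the others force the 
solution to be a valid stationary distribution for the Markov 
chain. 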

The LP defined above thus optimizes the steady-state dis- 
tribution of state-action pairs. The involved expression of 
the average reward and cost functions defining the objective 
and constraint of the original problem are thus translated 
into linear combinations of the optimization variables.⁵ The 
condition for optimality is that the Markov chain under all the 
policies is unichain [20], i.e., it has a single recurrent class and 
an arbitrary number of transient classes. This property holds, 
in our case, for any policy and any set of parameters as defined 
throughout the paper. 

The optimal poUcy is then p{9,a)=z'a{9)/{^^Zu{9)) if 



E 



,{9) = 1, i.e., 9 is recurrent. If Euew -^^{9) — 0, i.e.. 



lieu ^« 



^This may not hold if we consider more complex networks or energy 
consumption metrics. This is left for future research. 

'a constraint on the failure probability can be formalized as a linear 
constraint as well through straightforward manipulation. 



Ws (k) = 7r„(0)Ato(l - i^) +^7r,(0)K(,(l - v*) 



{l-a){l ^ 



i+«eLY n-=i(p+(i-p)A«.) 



(18) 



is transient, then the map in $\theta$ is $\mu(\theta,a)=1$ for a randomly chosen $a\in\mathcal{U}$ and $\mu(\theta,a)=0$ otherwise. In the model at hand, which considers a binary action whose randomization corresponds to the probability that the secondary source transmits in a given state, the transmission probability $\kappa_\theta$ simply corresponds to $\mu(\theta,1)$. Note that $\sum_{\theta=0}^{T}z_1(\theta)$ is the fraction of time in which the secondary source transmits.
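The mapping from an LP solution to the transmission probabilities can be sketched in a few lines of Python. The function name `policy_from_occupation`, the nested-list layout `z[theta][u]` and the example numbers are illustrative assumptions, not part of the original formulation. For transient states the sketch picks a fixed default action instead of a random one, which is equally admissible.

```python
def policy_from_occupation(z, default_action=0):
    """Turn joint state-action probabilities z[theta][u] into a policy mu[theta][u]."""
    policy = []
    for row in z:
        mass = sum(row)
        if mass > 0:
            # Recurrent state: normalise the joint probabilities of that state.
            policy.append([p / mass for p in row])
        else:
            # Transient state: any deterministic choice is admissible.
            policy.append([1.0 if u == default_action else 0.0 for u in range(len(row))])
    return policy


# Hypothetical LP solution for T = 3 (rows: states 0..3, columns: silent / transmit).
z = [[0.0, 0.5], [0.1, 0.3], [0.1, 0.0], [0.0, 0.0]]
mu = policy_from_occupation(z)
kappa = [row[1] for row in mu]  # kappa_theta = mu(theta, 1)
print(kappa)
```

In this toy solution only state 1 is randomized, in line with the bound on the number of randomizations.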

It is also shown in [19] that the number of randomizations, i.e., the number of states in which the policy is non-deterministic, is equal to or smaller than the number of independent constraints in Equation (19). Thus, in the model at hand, the optimal policy found via the above LP is non-deterministic in at most one state and the optimal vector $\kappa$ is a vector with $N_1$ ones, $N_0$ zeros and $N_r$ elements in $(0,1)$, with $N_1+N_0+N_r=T+1$, $0\le N_1\le T$, $0\le N_0\le T$, and $N_r=1$ or $0$. The space of the vectors described by the above conditions is denoted in the following with $\mathcal{M}_r$.

In the following Section, we will show that, if $\nu^*=\nu$, the optimal policy $\kappa$ has a precise structure that enables its calculation through a simple algorithm, thereby avoiding the need to solve the linear program stated before. In particular, the optimal policy concentrates transmissions by the secondary source in the first transmissions of the primary source packets. Therefore, the $N_1$ unit elements and the $N_0$ zero elements are the first $N_1$ and the last $N_0$ elements of the vector $\kappa$, respectively. If $N_1+N_0=T$, then randomization occurs at the $(N_1+1)$-th state, otherwise the policy is deterministic.

As a side comment, we observe that in a pure collision scenario, where concurrent transmissions always fail, a policy such that $\kappa_0=1$ and $\kappa_\theta=0$, $0<\theta\le T$, is optimal. This is the white spaces approach. In fact, if the failure probabilities of both users under concurrent transmission are set to one, the secondary source gains nothing when transmitting concurrently with the primary source, while increasing the cost of the latter. In general, if the secondary source bases its strategy on channel sensing only, it can distinguish between an idle slot ($\theta=0$) and a non-idle slot ($0<\theta\le T$). The resulting strategy assigns the transmission probabilities $\kappa_0=1$ and $\kappa_\theta=\kappa$, $\forall\,0<\theta\le T$. We will show through numerical results that this policy is suboptimal.

IV. Optimal Transmission Strategy for the 
Z-Interference Channel 

In this Section, we address the structure of the optimal 
transmission strategy in the particular case in which $\nu^*=\nu$, that
is, the transmission by the primary source does not affect the 
successful decoding probability of the packet of the secondary 
source by the secondary receiver. 

This assumption can be referred to the well-known Z-interference channel framework, where the interference link between the primary source and the secondary destination is removed. We observe that this does not require the interference channel between the primary source and the secondary destination to be physically absent. For instance, this model also fits the case in which $g_{ps}\ll g_{ss}$ with high probability, or the primary source transmits with a rate $R_p$ sufficiently low to allow the secondary destination to decode and cancel the interference from the primary source with high probability.

In this case $\nu^*=\nu$, and thus the failure probability at the secondary destination does not influence the solution of the optimization problem. In fact, the success probability $1-\nu$ only represents a scaling factor for the reward achieved by the secondary source. Thus, in the following, with $W_s(\kappa)$ we refer to the normalized reward $W_s(\kappa)/(1-\nu)$. We remark that the optimal policy $\kappa$ when maximizing the reward or the normalized reward of the secondary source is the same, and that the throughput is simply the normalized reward multiplied by the success probability.

A. Structure of the Optimal Policy 

In the following, we show that the optimal transmission policy for the secondary source when $\nu=\nu^*$ has a specific structure. The transmission strategy maximizing the throughput of the secondary source, given the constraint on the primary source's throughput loss, concentrates interference in the first transmissions of each of the packets sent by the primary source. The policy has the structure described in Theorem 5, where the secondary user transmits with probability 1 in states $\theta<N_1$, probability $\kappa_{N_1}\in[0,1]$ in state $N_1$ and probability equal to zero in states $\theta>N_1$. The values of $N_1$ and $\kappa_{N_1}$ are functions of the parameters of the system and of the throughput constraint. It can be shown that the same structure applies if the constraint is on the failure probability of primary source's packets. The definitions and proof for this last case are provided in Section IV-B.

As discussed before, interference from the secondary source in different states has a different effect. In fact, if the secondary source increases its transmission probability in state $j$, with $0<j<T$, it also increases the average failure probability $\rho_j$. This means that the primary source fails more often in the $j$-th transmission of a packet. Therefore, the Markov process hits more frequently the states with indices larger than $j$, and less frequently all the other states, that is, the steady-state probability $\pi_\kappa(t)$ of the states $t>j$ grows, while the same probability associated with the states $t<j$ decreases.
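The shift of the steady-state distribution can be checked numerically. The sketch below assumes the per-state failure probability $\rho+(1-\rho)\lambda\kappa_\theta$ that appears in Eq. (18), and a chain in which a failed transmission moves to the next retransmission state while a new packet is available with probability $\alpha$. The function name `steady_state` and all parameter values are illustrative choices, not taken from the paper.

```python
def steady_state(kappa, alpha, rho, lam):
    """Stationary distribution [pi(0), ..., pi(T)] of the retransmission chain.

    kappa[theta] is the secondary transmission probability in state theta,
    alpha the arrival probability, and rho + (1 - rho) * lam * kappa[theta]
    the assumed primary failure probability in state theta >= 1.
    """
    T = len(kappa) - 1
    # prod[t] = rho_1 * ... * rho_t, with the empty product prod[0] = 1
    prod = [1.0]
    for theta in range(1, T + 1):
        prod.append(prod[-1] * (rho + (1.0 - rho) * lam * kappa[theta]))
    denom = 1.0 + alpha * sum(prod[1:T])
    return [(1.0 - alpha) / denom] + [alpha * prod[t - 1] / denom for t in range(1, T + 1)]


# Illustrative parameters only.
alpha, rho, lam = 0.5, 0.2, 0.5
before = steady_state([1.0, 0.2, 0.2, 0.2], alpha, rho, lam)
after = steady_state([1.0, 0.2, 0.7, 0.2], alpha, rho, lam)  # more interference in state j = 2

for theta, (b, a) in enumerate(zip(before, after)):
    print(f"pi({theta}): {b:.4f} -> {a:.4f}")
```

With these numbers the probability of state 3 grows, while those of states 0, 1 and 2 shrink.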

Moreover, as observed before, if $j<r$, then the steady-state probability associated with state $j$ is larger than that of state $r$. Thus, if the secondary source increases its transmission probability in state $j$, it increases the overall level of interference more than if the same increase is applied to state $r$. Thus, the state in which the interference is increased influences the bias on the stochastic process of the primary source, due to the activity of the secondary source, as well as the overall cost incurred by the former. The normalized reward of the secondary source counts the fraction of slots in which the secondary source transmits. Since $\pi_\kappa(j)>\pi_\kappa(r)$, the overall reward grows more if the transmission probability is increased in state $j$ than if the same increase is applied to state $r>j$. On the other hand, note that if $\kappa_j$ is increased, then $\pi_\kappa(j)$ decreases. Nevertheless, we will show in the following that, if $\nu^*=\nu$, an increased transmission probability in any of the states results in a larger secondary source's throughput.

The main intuition behind the structure of the optimal transmission policy is that, when considering the same increase of the transmission probability, the reward of the secondary source grows faster than the cost of the primary source. Moreover, the difference between the increase of the reward of the secondary source and the increase of the cost of the primary source grows much faster if the transmission probability is increased in state $j$, with respect to the same quantity measured if the transmission probability is increased in state $r>j$. Based on these observations, it is possible to show that among the set of the transmission policies resulting in the same cost for the primary source, the one that most concentrates the interfering transmissions in the first transmissions of the primary source's packets achieves the optimal throughput.

We first state the following theorems:

Theorem 1: $J_p(\kappa)$ is a strictly increasing function of $\kappa_\theta$, with $\theta>0$.^6

Theorem 2: $W_s(\kappa)$ is a strictly increasing function of $\kappa_\theta$, with $\theta>0$.

Formal proofs of these Theorems are provided in Appendices A and B.

Theorem 1 states that the cost of the primary source increases as the fraction of slots in which the secondary user accesses the channel gets larger. This is rather intuitive, as a larger amount of interference cannot result in a larger throughput for the interfered link, at least in the framework considered herein.

Theorem 2 states that the average throughput of the secondary source increases as the fraction of slots in which it accesses the channel gets larger. Although this result also agrees with intuition, it must be observed that transmission by the secondary source in a certain slot also modifies the steady-state distribution of the Markov chain for the primary source. For instance, the steady-state probability of state 0, in which the secondary source can always transmit, decreases as $\kappa_\theta$ gets larger, with $0<\theta<T$.^7 However, the theorem states that the gain outweighs the potential loss under the assumptions on the decoding failure at the secondary receiver stated before.

The previously stated theorems guarantee that the optimal policy lies in the space of policies where the constraint on the primary throughput loss of Eq. (9) is active, i.e., $\Delta(\underline{0},\kappa)=\sigma$, unless $\Delta(\underline{0},\underline{1})<\sigma$, where $\underline{1}$ is a $(T+1)$-long vector whose elements are all ones. In fact, in this latter case, the secondary source transmits in all the slots with probability one, and, if this policy results in a cost for the primary source smaller than the maximum admitted, then a policy that activates the constraint does not exist. Moreover, under the policy $\underline{1}$, the secondary user achieves the maximum possible throughput, i.e., $W_s(\underline{1})=1$.^8 Thus, if $\underline{1}$ is admissible, then it is also optimal.

Let us consider now the case $\Delta(\underline{0},\underline{1})>\sigma$ and define $\underline{u}_i$ as a $(T+1)$-long vector of all zeros except for the $i$-th element that is equal to one, $0\le i\le T$.

Consider a policy $\kappa'\neq\underline{1}$ such that $\Delta(\underline{0},\kappa')<\sigma$. Since $J_p$ is continuous, there exist $\delta>0$ and $0<j\le T$ such that $\Delta(\underline{0},\kappa'')\le\sigma$, where $\kappa''=\kappa'+\underline{u}_j\delta$. Due to Theorem 2 the reward achieved by policy $\kappa''$ is larger than that achieved by policy $\kappa'$, i.e., $W_s(\kappa'')>W_s(\kappa')$. Note that for any $\delta>0$, we also have $J_p(\kappa'')>J_p(\kappa')$, and thus $\sigma-\Delta(\underline{0},\kappa'')<\sigma-\Delta(\underline{0},\kappa')$.

^6 We remark that the cost is independent of $\kappa_0$, whose value has been set to one by assumption.

^7 In the case $\theta=T$, transmission by the secondary source does not modify the steady-state distribution.

^8 We recall that, throughout this section, the throughput of the secondary source is normalized to the success probability $1-\nu=1-\nu^*$.

Figure 4. Policies $\kappa'$ and $\kappa''$ as defined in Theorems 3 and 4.

Thus, for any policy $\kappa'$ resulting in a maximum performance loss below $\sigma$, there exists an admissible policy $\kappa''$ such that the secondary user achieves an improved throughput, while the cost of the primary user increases. Any policy $\kappa'$ such that $\Delta(\underline{0},\kappa')$ is strictly smaller than $\sigma$ is thus non-optimal.

As a consequence of the previous statements, if the problem in (9) is feasible, then the optimal policy $\kappa$ lies in the space $\mathcal{M}_\sigma=\{\kappa:\Delta(\underline{0},\kappa)=\sigma\}\cup\{\underline{1}\}$.

We now formalize the intuition discussed before by stating 
the following theorem: 

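Theorem 3: Consider a policy $\kappa$ such that $\kappa_j=\kappa_r$ and $\kappa_\theta=0$, $\forall\theta>r$ and with $0<j<r\le T$.

Define the two policies $\kappa'$ and $\kappa''$ as $\kappa'=\kappa+\underline{u}_j\delta'_j$ and $\kappa''=\kappa+\underline{u}_r\delta''_r$, with $0<\delta'_j\le 1-\kappa_j$ and $0<\delta''_r\le 1-\kappa_r$ (see Fig. 4 for a graphical representation). If $J_p(\kappa')=J_p(\kappa'')$ then $W_s(\kappa')>W_s(\kappa'')$.

The proof of the theorem is provided in Appendix C.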

Theorem 3 states that, starting from a policy $\kappa$ respecting the hypothesis, if the policy obtained by increasing $\kappa_j$ and the policy obtained by increasing $\kappa_r$ incur the same average primary source's cost, then, if $j<r$, the reward associated with the former is larger than the reward associated with the latter. As discussed before, this result is due to the difference between the reward and cost increase corresponding to an increased transmission probability in a certain state. This quantity grows faster if the transmission probability is increased in state $j$ with respect to state $r$, with $j<r$.

Similarly, it can be shown that if the policy obtained by decreasing $\kappa_j$ and the policy obtained by decreasing $\kappa_r$ result in the same average primary source's cost, then the reward achieved by the latter is larger than the reward achieved by the former. Formally:

Theorem 4: Consider a policy $\kappa$ such that $\kappa_j=\kappa_r$ and $\kappa_\theta=0$, $\forall\theta>r$ and with $0<j<r\le T$.

Define the two policies $\kappa'$ and $\kappa''$ as $\kappa'=\kappa-\underline{u}_j\delta'_j$ and $\kappa''=\kappa-\underline{u}_r\delta''_r$, with $0<\delta'_j\le\kappa_j$ and $0<\delta''_r\le\kappa_r$ (see Fig. 4 for a graphical representation). If $J_p(\kappa')=J_p(\kappa'')$ then $W_s(\kappa')<W_s(\kappa'')$.

The proof of the theorem is provided in Appendix D.

Theorems 3 and 4 are the basis for the derivation of the structure of the optimal policy $\kappa$, defined by the following theorem:

Theorem 5: The optimal policy $\kappa$ has the following structure

$$\kappa=[\underline{1}_{N_1},\ \kappa_{N_1},\ \underline{0}_{N_0}], \qquad (20)$$

where $\underline{1}_{N_1}$ and $\underline{0}_{N_0}$ are vectors of $N_1$ ones and $N_0$ zeros, respectively, and $0\le\kappa_{N_1}\le 1$.

Thus, the optimal policy concentrates transmission by the 
secondary source in those states associated with the first 
transmissions of primary source's packets. Intuitively, if the 
interference generated by the primary user's transmission has 
a small impact on the reception of secondary user's packets, 
then the difference between the secondary user reward increase 
and the primary user cost increase corresponding to an increase 
of the transmission probability in the early retransmissions 
is positive and larger than that corresponding to the same 
increase in the late retransmissions. In fact, the throughput 
achieved by the secondary user is not affected by the access 
rate of the primary user, and thus, a transmission probability 
increase corresponds to a positive reward in all the states. 
Moreover, the cost increase of the primary user accounts for the fact that additional primary user's retransmissions due to secondary user interference take place in otherwise idle slots with a positive probability, that is, there is a positive probability that retransmissions do not affect the throughput of the primary user. This reduces the cost increase speed in the early retransmissions of primary user packets and results in the unique structure of the optimal transmission policy discussed before.

We remark that the optimal transmission strategy of the 
secondary user is defined under the constraint on the max- 
imum performance loss of the primary user. Therefore, the 
transmission probability in all the states $\theta>0$ is bounded
by the constraint. We also observe that the transmission 
strategies proposed in prior literature addressing cognitive 
networks do not consider the long term impact of interference. 
Therefore, these strategies may fail to guarantee the minimum 
performance to the primary user in those scenarios in which 
the primary user implements protocols and mechanisms which 
react to interference and packet failure. 

Theorem 5 has a very intuitive proof, sketched in the following. As observed before, if the problem in Eq. (9) is feasible, then the policy lies in the space of policies $\mathcal{M}_\sigma=\{\kappa:\Delta(\underline{0},\kappa)=\sigma\}\cup\{\underline{1}\}$. Moreover, according to [19], the optimal policy is a randomized policy with randomization in at most one state. We recall that the space of transmission probability vectors associated with those policies, i.e., the space of the vectors with $N_1$ ones, $N_0$ zeros and $N_r$ elements in $(0,1)$, with $N_1+N_0+N_r=T+1$, $0\le N_1\le T$, $0\le N_0\le T$, and $N_r=1$ or $0$, is denoted with $\mathcal{M}_r$. Therefore, the optimal transmission probability vector lies in the space $\mathcal{M}_r\cap\mathcal{M}_\sigma$.

If $\underline{1}$ is admissible, then it is the optimal policy and Theorem 5 holds with $N_1=T+1$, $N_0=0$ and $\kappa_{N_1}=1$.

Assume now that $\underline{1}$ is not admissible, i.e., $\Delta(\underline{0},\underline{1})>\sigma$. If the optimization problem is feasible, then there exists a policy $\kappa^{(0)}\in\mathcal{M}_r\cap\mathcal{M}_\sigma$. Starting from $\kappa^{(0)}$ it is possible to construct a sequence of policies $\kappa^{(1)},\kappa^{(2)},\ldots$ in $\mathcal{M}_r\cap\mathcal{M}_\sigma$ such that $W_s(\kappa^{(k+1)})>W_s(\kappa^{(k)})$ and converging to the optimal policy, which has the structure described in Theorem 5. Consider a policy $\kappa^{(k)}\in\mathcal{M}_r\cap\mathcal{M}_\sigma$ and fix $r=\max\{\theta:\kappa_\theta^{(k)}>0\}$, i.e., $r$ is the largest state with a non-zero transmission probability.

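Assume $\kappa_r^{(k)}<1$, i.e., randomization occurs in state $r$. Then, the policy in any state $\theta\neq r$ is either $\kappa_\theta^{(k)}=1$ or $\kappa_\theta^{(k)}=0$.

If $\exists\theta:\theta<r,\ \kappa_\theta^{(k)}=0$, i.e., the transmission probability in $\theta$ is zero, then define $\kappa^{(k+1)}=\kappa^{(k)}+\underline{u}_j\delta_j-\underline{u}_r\kappa_r^{(k)}$ (Fig. 5a), where $j=\min\{\theta:\kappa_\theta^{(k)}=0\}$ and with $\delta_j>0$ such that

$$J_p(\kappa^{(k)}+\underline{u}_j\delta_j-\underline{u}_r\kappa_r^{(k)})=J_p(\kappa^{(k)}). \qquad (21)$$

We observe that such a $\delta_j$ always exists, due to the continuity of the cost function, Theorem 1 and the fact that $\partial J_p(\kappa)/\partial\kappa_j>\partial J_p(\kappa)/\partial\kappa_r$.^9 Note that $\kappa^{(k+1)}\in\mathcal{M}_r\cap\mathcal{M}_\sigma$. Moreover, $W_s(\kappa^{(k+1)})>W_s(\kappa^{(k)})$. In fact, define the policy $\kappa^*=\kappa^{(k)}-\underline{u}_r\kappa_r^{(k)}$. Thus, $\kappa^{(k)}=\kappa^*+\underline{u}_r\kappa_r^{(k)}$ and $\kappa^{(k+1)}=\kappa^*+\underline{u}_j\delta_j$ (Fig. 5b). Since $\kappa^*_j=\kappa^*_r=0$, $\kappa^*_\theta=0$, $\forall\theta>r$ and $J_p(\kappa^{(k)})=J_p(\kappa^{(k+1)})$, then, according to Theorem 3, $W_s(\kappa^{(k+1)})>W_s(\kappa^{(k)})$. Note that $\kappa^{(k+1)}$ is obtained from $\kappa^{(k)}$ by draining transmission probability in state $r$, and pumping it into state $j<r$. In fact, $\kappa_j^{(k+1)}=\delta_j>\kappa_j^{(k)}=0$ and $\kappa_r^{(k+1)}=0<\kappa_r^{(k)}$.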
If $\kappa_r^{(k)}=1$ then there may exist a state $\theta<r$ such that $0<\kappa_\theta^{(k)}<1$, i.e., the randomization occurs in state $\theta$. If such a state does not exist, i.e., the map in all states $\theta<r$ is deterministic, and there exists instead at least one state $\theta:\theta<r,\ \kappa_\theta^{(k)}=0$, then fix $j=\min\{\theta:\kappa_\theta^{(k)}=0\}$. $\kappa^{(k+1)}$ is then constructed from $\kappa^{(k)}$ as described before, and via the same considerations it can be shown that it achieves an improved throughput.

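If $\kappa_r^{(k)}=1$ and $\exists\theta<r:0<\kappa_\theta^{(k)}<1$, then we fix $j=\theta$. If there exists $0<\delta_r\le\kappa_r^{(k)}=1$ such that

$$J_p(\kappa^{(k)}+\underline{u}_j(1-\kappa_j^{(k)})-\underline{u}_r\delta_r)=J_p(\kappa^{(k)}), \qquad (22)$$

i.e., there exists a policy obtained by decreasing the transmission probability in $r$ and setting the transmission probability in $j$ to unity which incurs the same average cost of policy $\kappa^{(k)}$, then $\kappa^{(k+1)}$ is defined as $\kappa^{(k+1)}=\kappa^{(k)}+\underline{u}_j(1-\kappa_j^{(k)})-\underline{u}_r\delta_r$ (see Fig. 5b). We then define the policy $\kappa^*=\kappa^{(k)}+\underline{u}_j(1-\kappa_j^{(k)})$. Thus, since $\kappa^*_j=\kappa^*_r=1$, $\kappa^{(k)}=\kappa^*-\underline{u}_j(1-\kappa_j^{(k)})$, $\kappa^{(k+1)}=\kappa^*-\underline{u}_r\delta_r$, $\kappa^*_\theta=0$, $\forall\theta>r$ and

$$J_p(\kappa^*-\underline{u}_j(1-\kappa_j^{(k)}))=J_p(\kappa^*-\underline{u}_r\delta_r), \qquad (23)$$

then, due to Theorem 4,

$$W_s(\kappa^{(k+1)})>W_s(\kappa^{(k)}). \qquad (24)$$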
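Assume now $\nexists\,\delta_r:0<\delta_r\le\kappa_r^{(k)}=1$ such that

$$J_p(\kappa^{(k)}+\underline{u}_j(1-\kappa_j^{(k)})-\underline{u}_r\delta_r)=J_p(\kappa^{(k)}), \qquad (25)$$

i.e., the cost obtained by setting $\kappa_j^{(k)}=1$ and nulling the transmission probability in state $r$ is larger than $J_p(\kappa^{(k)})$. In this case, the policy $\kappa^{(k+1)}$ is defined as $\kappa^{(k+1)}=\kappa^{(k)}+\underline{u}_j\delta_j-\underline{u}_r$ (see Fig. 5c), with $0<\delta_j<1-\kappa_j^{(k)}$ and

$$J_p(\kappa^{(k)}+\underline{u}_j\delta_j-\underline{u}_r)=J_p(\kappa^{(k)}). \qquad (26)$$

^9 This intuitive inequality, not proved herein, can be derived by using the expression for the partial derivatives reported in Appendix A.

Figure 5. Policies $\kappa^{(k)}$ and $\kappa^{(k+1)}$.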



Therefore, the policy $\kappa^{(k+1)}$ is obtained from $\kappa^{(k)}$ by setting to zero the $r$-th element of the vector and increasing accordingly the $j$-th element. We show in the following that the reward achieved by policy $\kappa^{(k+1)}$ is larger than that achieved by policy $\kappa^{(k)}$ also in this case. The general problem of optimizing $\kappa_j$ and $\kappa_r$ given the transmission probabilities in all the other states (set according to $\kappa^{(k)}$) can be seen as a reduced version of the linear program (19). The optimal solution of this reduced problem is again a randomized policy with randomization in at most one state, i.e., at least one between $\kappa_j$ and $\kappa_r$ is set to either unity or zero. In the case we are considering, the only two reduced policies, corresponding to pairs $(\kappa_j,\kappa_r)$, with at most one randomization activating the constraint on the maximum performance loss are $(\kappa_j^{(k)},\kappa_r^{(k)})$ and $(\kappa_j^{(k+1)},\kappa_r^{(k+1)})$ as defined before. The solution of the reduced LP is then either $(\kappa_j^{(k)},\kappa_r^{(k)})$ or $(\kappa_j^{(k+1)},\kappa_r^{(k+1)})$. Fortunately, Theorem 3 ensures that there exists at least one policy achieving a reward larger than $\kappa^{(k)}$ with the same cost. Therefore, $\kappa^{(k)}$ is suboptimal, and $\kappa^{(k+1)}$ is optimal. Define the policy $\kappa^*=\kappa^{(k)}-\underline{u}_r(1-\kappa_j^{(k)})$. Thus, $\kappa^{(k)}=\kappa^*+\underline{u}_r(1-\kappa_j^{(k)})$. It can be shown that there exists $\delta'_j$, with $0<\delta'_j<1-\kappa_j^{(k)}$, such that

$$J_p(\kappa')=J_p(\kappa^{(k)}), \qquad (27)$$

where $\kappa'=\kappa^*+\underline{u}_j\delta'_j$. Since $\kappa^*_j=\kappa^*_r$, $\kappa^*_\theta=0$, $\forall\theta>r$, and the above equalities, according to Theorem 3 we have

$$W_s(\kappa')>W_s(\kappa^{(k)}). \qquad (28)$$

As a consequence, $(\kappa_j^{(k+1)},\kappa_r^{(k+1)})$ is the optimal solution of the reduced LP introduced above, and policy $\kappa^{(k+1)}$ achieves the maximum reward given the constraint and once fixed the other transmission probabilities.

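In all the cases presented, the transmission probability is drained from state $r$ and pumped into state $j<r$. Note that it is possible to continue the iterations as long as there exists a pair $(j,r):j<r,\ \kappa_j^{(k)}<\kappa_r^{(k)}$. If such indices cannot be found, the iterations terminate with the policy $\kappa^{(k)}\in\mathcal{M}_r\cap\mathcal{M}_\sigma$. It can be easily seen that the iterations terminate with the unique policy in $\mathcal{M}_r\cap\mathcal{M}_\sigma$ characterized by the structure indicated in Theorem 5, i.e.,

$$\kappa=[\underline{1}_{N_1},\ \kappa_{N_1},\ \underline{0}_{N_0}], \qquad (29)$$

where $\underline{1}_{N_1}$ and $\underline{0}_{N_0}$ are vectors of $N_1$ ones and $N_0$ zeros, respectively, and $0\le\kappa_{N_1}\le 1$.

Since from any policy $\kappa^{(0)}\in\mathcal{M}_r\cap\mathcal{M}_\sigma$ the iterations produce a policy in $\mathcal{M}_r\cap\mathcal{M}_\sigma$ with the structure of Eq. (29) and a reward larger than that of $\kappa^{(0)}$, the policy with this structure is the optimal policy.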

Theorem 5, besides unveiling an important feature of the optimal interference control strategy in retransmission-based systems, also has an immediate practical meaning. In fact, the optimal policy can be computed through a simple algorithm that generates a sequence $\kappa^{(0)},\kappa^{(1)},\ldots$ of at most $T$ policies terminating with $\kappa$.

Let us fix $\kappa^{(0)}=\underline{1}$. If $\Delta(\underline{0},\underline{1})\le\sigma$, then the optimal policy is $\kappa=\underline{1}$. Otherwise, if $\Delta(\underline{0},\kappa^{(0)}-\underline{u}_T)\le\sigma$ the algorithm terminates with the optimal policy $\kappa=\kappa^{(0)}-\underline{u}_T\delta_T$, where $\delta_T$ is the unique solution of $\Delta(\underline{0},\kappa^{(0)}-\underline{u}_T\delta)=\sigma$. If instead $\Delta(\underline{0},\kappa^{(0)}-\underline{u}_T)>\sigma$, the algorithm sets $\kappa^{(1)}=\kappa^{(0)}-\underline{u}_T$, and continues with the next iteration.

Similarly to the previous step, if $\Delta(\underline{0},\kappa^{(1)}-\underline{u}_{T-1})\le\sigma$ the algorithm terminates with the optimal policy $\kappa=\kappa^{(1)}-\underline{u}_{T-1}\delta_{T-1}$, where $\delta_{T-1}$ is the solution of $\Delta(\underline{0},\kappa^{(1)}-\underline{u}_{T-1}\delta)=\sigma$. Otherwise, the algorithm sets $\kappa^{(2)}=\kappa^{(1)}-\underline{u}_{T-1}$, and so on.

Thus, the algorithm sequentially evaluates the variables $\kappa_j$ in decreasing order from $T$ and terminates with the optimal policy as soon as it finds the first non-zero element.
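The procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the cost $\Delta(\underline{0},\kappa)$ is taken to be the drop in delivered primary packets per slot, the per-state failure probability is assumed to be $\rho+(1-\rho)\lambda\kappa_\theta$ as in Eq. (18), and the equation $\Delta=\sigma$ is solved by bisection (valid because the cost is increasing in each $\kappa_\theta$, Theorem 1). The names `primary_throughput`, `throughput_loss`, `optimal_policy` and the parameter values are hypothetical.

```python
def primary_throughput(kappa, alpha, rho, lam):
    """Delivered primary packets per slot (assumed performance metric)."""
    T = len(kappa) - 1
    prod = [1.0]  # prod[t] = rho_1 * ... * rho_t
    for theta in range(1, T + 1):
        prod.append(prod[-1] * (rho + (1.0 - rho) * lam * kappa[theta]))
    pi_1 = alpha / (1.0 + alpha * sum(prod[1:T]))  # rate at which packets start
    return pi_1 * (1.0 - prod[T])                  # times P(packet is delivered)


def throughput_loss(kappa, alpha, rho, lam):
    """Assumed cost Delta(0, kappa): loss with respect to a silent secondary."""
    silent = [kappa[0]] + [0.0] * (len(kappa) - 1)
    return primary_throughput(silent, alpha, rho, lam) - primary_throughput(kappa, alpha, rho, lam)


def optimal_policy(T, sigma, loss, tol=1e-12):
    """Drain transmission probability from state T downwards until loss <= sigma."""
    kappa = [1.0] * (T + 1)
    if loss(kappa) <= sigma:
        return kappa  # the all-ones policy is admissible, hence optimal
    for theta in range(T, 0, -1):
        trial = list(kappa)
        trial[theta] = 0.0
        if loss(trial) <= sigma:
            # The constraint becomes active for some kappa_theta in [0, 1]: bisection.
            lo, hi = 0.0, 1.0
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                trial[theta] = mid
                if loss(trial) <= sigma:
                    lo = mid
                else:
                    hi = mid
            trial[theta] = lo
            return trial
        kappa = trial  # still too costly: silence state theta and move on
    return kappa


# Illustrative parameters only.
alpha, rho, lam, sigma = 0.5, 0.2, 0.5, 0.1
loss = lambda k: throughput_loss(k, alpha, rho, lam)
kappa_opt = optimal_policy(3, sigma, loss)
print(kappa_opt)
```

For these values the sketch silences state 3, randomizes in state 2 and keeps probability one in states 0 and 1, which is the structure of Theorem 5.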

The structure of the optimal policy leads to another important observation. Consider a secondary source adopting a sensing approach, such that it always transmits when the channel is sensed idle, and transmits with fixed probability $\kappa$ when the channel is sensed busy (see Fig. 6a). Thus, the secondary source transmits with probability equal to $\kappa$ in all the states $\theta$, with $0<\theta\le T$. We call this strategy horizontal flooding, meaning that the secondary source equalizes the transmission probability such that it reaches the same level in all the states in which the primary source transmits.

Figure 6. Graphical representation of a) the policy in which the secondary source accesses slots in which the primary source transmits with fixed probability, and b) the optimal policy.

The optimal transmission strategy defined above determines the transmission probabilities in the various states of the Markov chain under the constraint on the maximum throughput loss of the primary user. As the constraint becomes tighter, the water level of the secondary user is drained from the upper states in Fig. 6b, corresponding to later retransmissions of primary user's packets, in order to reduce the impact of the activity of the secondary user on the primary user's throughput. Note that if $\sigma=0$, the secondary user is always silent unless the primary user is idle. The optimal policy corresponds to a vertical flooding, where states with a smaller index are flooded with water, i.e., transmission probability, first (see Fig. 6b). The horizontal approach, while sometimes simpler to implement, is suboptimal due to Theorem 3.
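The gap between horizontal and vertical flooding can be illustrated numerically. The sketch below is a toy comparison under assumptions that are not taken from the paper: the cost is the drop in delivered primary packets per slot, the failure probability in state $\theta$ is $\rho+(1-\rho)\lambda\kappa_\theta$, the reward is the normalized one (fraction of slots in which the secondary source transmits), and all names and parameter values are hypothetical.

```python
def chain(kappa, alpha, rho, lam):
    """Return (stationary distribution, primary throughput) under policy kappa."""
    T = len(kappa) - 1
    prod = [1.0]  # prod[t] = rho_1 * ... * rho_t
    for theta in range(1, T + 1):
        prod.append(prod[-1] * (rho + (1.0 - rho) * lam * kappa[theta]))
    denom = 1.0 + alpha * sum(prod[1:T])
    pi = [(1.0 - alpha) / denom] + [alpha * prod[t - 1] / denom for t in range(1, T + 1)]
    return pi, pi[1] * (1.0 - prod[T])


ALPHA, RHO, LAM, SIGMA, T = 0.5, 0.2, 0.5, 0.1, 3  # illustrative values only


def loss(kappa):
    """Assumed cost: primary throughput lost with respect to a silent secondary."""
    silent = [1.0] + [0.0] * T
    return chain(silent, ALPHA, RHO, LAM)[1] - chain(kappa, ALPHA, RHO, LAM)[1]


def reward(kappa):
    """Normalized secondary reward: fraction of slots in which it transmits."""
    pi, _ = chain(kappa, ALPHA, RHO, LAM)
    return sum(p * k for p, k in zip(pi, kappa))


def largest_feasible(make_policy, tol=1e-12):
    """Largest x in [0, 1] with loss(make_policy(x)) <= SIGMA, found by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loss(make_policy(mid)) <= SIGMA:
            lo = mid
        else:
            hi = mid
    return lo


def horizontal():
    """Same transmission probability in every busy state."""
    x = largest_feasible(lambda v: [1.0] + [v] * T)
    return [1.0] + [x] * T


def vertical():
    """Drain probability from the last retransmission state downwards."""
    kappa = [1.0] * (T + 1)
    for theta in range(T, 0, -1):
        if loss(kappa) <= SIGMA:
            break
        kappa[theta] = largest_feasible(lambda v: kappa[:theta] + [v] + kappa[theta + 1:])
    return kappa


h, v = horizontal(), vertical()
print("horizontal:", h, "reward", round(reward(h), 4))
print("vertical:  ", v, "reward", round(reward(v), 4))
```

Both policies activate the same cost constraint, and in this example the vertical policy collects the larger normalized reward.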

Finally, we observe that the arrival rate at the primary source influences the aggressiveness of the secondary source. Clearly, as $\alpha$ decreases, also the average throughput of the primary source decreases, as the fraction of time spent sending packets decreases. As the elements of $\kappa$ get larger, if $\alpha$ is small, the impact of the increased transmission probability is small, as the secondary source is increasing its access rate in states with low probability. Interestingly, the fraction of throughput lost by the primary source decreases as the arrival rate $\alpha$ gets smaller.



B. Constraint on the Average Failure Probability 

As shown in the previous Section, if the constraint on the 
average performance loss at the primary transmitter is defined 
for the throughput loss, then the optimal policy concentrates 
transmission and interference of the secondary source in the 
first transmissions of the primary source's packets. 

Remarkably, the same structure applies to an analogous 
optimization problem in which the constraint is defined for 
the increase of the failure probability of the primary source's 
packets. 

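The average cost is trivially the probability that all the transmissions of a packet fail, i.e.,

$$J_p^{f}(\kappa)=\prod_{\theta=1}^{T}\left(\rho+(1-\rho)\lambda\kappa_\theta\right). \qquad (30)$$

The above expression for $J_p^{f}(\kappa)$ is also obtained by assigning the following average cost to the various states:

$$\gamma_p(\kappa,\theta)=0 \quad \forall\theta=0,1,\ldots,T-1 \qquad (31)$$

$$\gamma_p(\kappa,T)=(\rho+(1-\rho)\lambda\kappa_T)/\pi_\kappa(1). \qquad (32)$$

In fact, recalling the steady-state probabilities provided in Eq. (18), the resulting average cost is

$$J_p^{f}(\kappa)=\sum_{\theta=0}^{T}\pi_\kappa(\theta)\gamma_p(\kappa,\theta)=\frac{\pi_\kappa(T)}{\pi_\kappa(1)}\left(\rho+(1-\rho)\lambda\kappa_T\right) \qquad (33)$$

$$=\prod_{\theta=1}^{T}\left(\rho+(1-\rho)\lambda\kappa_\theta\right). \qquad (34)$$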



Intuitively, the failure probability is the ratio between the fraction of slots in which the process is in state $T$ and a packet fails, i.e., $\pi_\kappa(T)\rho_T$,^10 and the fraction of slots in which the process starts the transmissions of a new packet, i.e., $\pi_\kappa(1)$. While the average throughput can be expressed as the time average of a sampling function, as shown earlier, the packet failure probability is the ratio of the time averages of sampling functions associated with state $T$ and state 1 multiplied by the failure probability in state $T$.

The cost in state $T$ is, thus, a function of the steady-state distribution. The optimization problem can be reduced to a Linear Program also in this case. In fact, the constraint on the packet failure probability

$$\frac{\pi_\kappa(T)}{\pi_\kappa(1)}\rho_T\le\sigma, \qquad (35)$$

can be rewritten as $\pi_\kappa(T)\rho_T-\pi_\kappa(1)\sigma\le 0$.

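Note that the structure of the cost function is significantly different with respect to the throughput case. In fact, under the hypothesis of Theorem 3, while $J_p(\kappa+\underline{u}_j\delta)>J_p(\kappa+\underline{u}_r\delta)$, in this case the equality holds, i.e., $J_p^{f}(\kappa+\underline{u}_j\delta)=J_p^{f}(\kappa+\underline{u}_r\delta)$. Therefore, the overall cost is insensitive to the state in which the secondary source increases the transmission probability. More formally, under the same hypothesis, fix $j$, $r$ and $\delta$, with $0<j<r\le T$ and $-\min(\kappa_j,\kappa_r)\le\delta\le\min(1-\kappa_j,1-\kappa_r)$, then

$$J_p^{f}(\kappa+\underline{u}_j\delta)=J_p^{f}(\kappa+\underline{u}_r\delta). \qquad (36)$$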

Interestingly, while the cost in terms of failure probability 
is insensitive to the state in which the secondary source 
increases/decreases the transmission probability, the secondary 
source's throughput increases faster if the transmission prob- 
ability is increased in the states with a small index. These 
considerations result in an overall behavior of the reward/cost 
tradeoff analogous to that resulting from a definition of the 
primary source's cost in terms of achieved throughput. Then, 
Theorems 3 and 4 hold for this definition of the cost. A
detailed proof can be found in Appendix E.
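The two properties of the failure-probability cost used above, namely the product form of Eqs. (30) and (34) and the insensitivity stated in Eq. (36), can be checked with a short script. It assumes the per-state failure probability $\rho+(1-\rho)\lambda\kappa_\theta$ and the steady-state distribution implied by Eq. (18). Function names and numbers are illustrative only.

```python
def steady_state(kappa, alpha, rho, lam):
    """Stationary distribution [pi(0), ..., pi(T)] of the retransmission chain."""
    T = len(kappa) - 1
    prod = [1.0]  # prod[t] = rho_1 * ... * rho_t
    for theta in range(1, T + 1):
        prod.append(prod[-1] * (rho + (1.0 - rho) * lam * kappa[theta]))
    denom = 1.0 + alpha * sum(prod[1:T])
    return [(1.0 - alpha) / denom] + [alpha * prod[t - 1] / denom for t in range(1, T + 1)]


def packet_failure_probability(kappa, rho, lam):
    """Probability that all T transmissions of a packet fail (product form)."""
    p = 1.0
    for k in kappa[1:]:
        p *= rho + (1.0 - rho) * lam * k
    return p


def bump(kappa, state, delta):
    """Return a copy of kappa with kappa[state] increased by delta."""
    out = list(kappa)
    out[state] += delta
    return out


alpha, rho, lam = 0.5, 0.2, 0.5      # illustrative values only
kappa = [1.0, 0.3, 0.3, 0.0]         # kappa_j = kappa_r for j = 1, r = 2
pi = steady_state(kappa, alpha, rho, lam)
rho_T = rho + (1.0 - rho) * lam * kappa[-1]

ratio_form = pi[-1] * rho_T / pi[1]  # pi(T) * rho_T / pi(1)
product_form = packet_failure_probability(kappa, rho, lam)
cost_j = packet_failure_probability(bump(kappa, 1, 0.4), rho, lam)
cost_r = packet_failure_probability(bump(kappa, 2, 0.4), rho, lam)
print(ratio_form, product_form, cost_j, cost_r)
```

The ratio form and the product form coincide, and moving the same increment $\delta$ from state $j$ to state $r$ leaves the failure probability unchanged when $\kappa_j=\kappa_r$.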

^10 We recall that $\rho_T=\rho+(1-\rho)\lambda\kappa_T$.



Note that again transmission by the secondary source in state 0 does not have any effect on the cost of the primary source, while the reward of the secondary source increases as $\kappa_0$ is increased, thus the optimal value for $\kappa_0$ is one.

As $J_p(\kappa)$, also the average cost $J_p^{f}(\kappa)$ is a strictly increasing function of any variable $\kappa_\theta$, with $\theta>0$. Therefore, for this constraint also, the optimal policy lies in the set $\{\kappa:\Delta(\underline{0},\kappa)=\sigma\}\cup\{\underline{1}\}$.

Since Theorems 3 and 4 hold, then it is possible to construct a sequence of randomized policies achieving an improved reward and converging to the optimal randomized policy defined in Theorem 5. Therefore, the optimal policy has the same structure of the optimal policy found for the previously considered case.

Other constraints, as well as other secondary source's performance metrics, may lead to a different optimal policy structure. For instance, the activity of the secondary source may be limited by a constraint on the average number of transmissions^11 of the packets of the primary source. For this metric, the average cost of the primary source is

$$1+\sum_{t=1}^{T-1}\prod_{i=1}^{t}\left(\rho+(1-\rho)\lambda\kappa_i\right). \qquad (37)$$

Observe that an increased transmission probability in state $j$ increases all the terms of the sum with $t\ge j$.

Similarly to the throughput case, the cost increase associated with an increased transmission probability of the secondary source in state $j$ is larger than the same increase in state $r>j$.
However, the difference between the average costs associated 
with the resulting policies may be larger than in the throughput 
case. Therefore, for some regions of the parameters, the sec- 
ondary source may be forced to concentrate its transmissions 
in the last transmissions of the primary source's packets. 

The optimization problem admits an analogous formulation, 
and it is possible to derive the structure of the optimal policy 
by following a logical procedure entirely similar to the one 
presented before. 

V. Discussion for the General Case 

The structure shown before holds if $\nu^*=\nu$, i.e., if primary source's transmission does not alter the decoding probability at the secondary receiver. The reward collected by the secondary source associated with transmission in state 0 or state $\theta>0$ is then the same. This assumption may fit some configurations of the network and receiver capabilities, e.g., the secondary source is much closer to the secondary receiver than the primary source, or the secondary receiver can effectively decode and cancel the signal from the primary source.

However, in general, $\nu^*\ge\nu$. In the following the case $\nu^*>\nu$ is discussed. This means that if $\kappa_0=\kappa_t$, $t>0$, the average reward of the secondary source in state 0 is larger than the reward in $t$. In fact, recalling that $\omega_s(\kappa_\theta,\theta)$ is the average reward collected by the secondary source in $\theta$ if the transmission probability is $\kappa_\theta$, we have:

$$\omega_s(\kappa_0,0)=(1-\nu)\kappa_0=(1-\nu)\kappa_t>(1-\nu^*)\kappa_t=\omega_s(\kappa_t,t). \qquad (38)$$

^11 This performance metric is sometimes referred to as delay in the technical literature.



Note that the observations made before on the average cost of the primary source remain valid. The average cost is a monotonic increasing function of the transmission probabilities and for $0<\delta\le 1-\max(\kappa_j,\kappa_r)$ and $0<j<r\le T$, the following holds:

$$J_p(\kappa+\underline{u}_j\delta)>J_p(\kappa+\underline{u}_r\delta) \qquad (39)$$

for any $\kappa$.

Interference due to primary source's transmission at the secondary receiver makes state 0 more desirable to the secondary source. As observed before, interference increases the average number of transmissions of the primary source's packets. Therefore, the activity of the secondary source reduces the fraction of slots spent by the primary source in the idle state.

Depending on $\nu^*$, $\nu$, $\rho^*$ and $\rho$, an increased transmission probability in a state $\theta>0$ may decrease the average throughput of the secondary source. Some insights can be extrapolated through the analysis of the case $T=2$, i.e., the primary source transmits each packet at most twice. The average reward of the secondary source is

$$W_s(\kappa)=\frac{(1-\alpha)(1-\nu)\kappa_0+(1-\nu^*)\alpha\left(\kappa_1+(\rho+(1-\rho)\lambda\kappa_1)\kappa_2\right)}{1+\alpha(\rho+(1-\rho)\lambda\kappa_1)}. \qquad (40)$$



$W_s(\kappa)$ is a monotonically increasing function of $\kappa_0$, irrespective of $\kappa_1$ and $\kappa_2$. In fact,

$$\frac{\partial W_s(\kappa)}{\partial\kappa_0}=\frac{(1-\alpha)(1-\nu)}{1+\alpha(\rho+(1-\rho)\lambda\kappa_1)}, \qquad (41)$$

which is trivially positive for any admissible set of parameters. The secondary source's transmission in state 0 does not modify the transition probabilities of the Markov chain. Therefore, any increase of $\kappa_0$ corresponds to an increased average reward, and since it does not influence the cost, again it is optimal to set $\kappa_0=1$.

Similar considerations apply to $\kappa_2$, and, more generally, to transmission in state $T$. We obtain

$$\frac{\partial W_s(\kappa)}{\partial\kappa_2}=1-\nu^*-\frac{1-\nu^*}{1+\alpha(\rho+(1-\rho)\lambda\kappa_1)}, \qquad (42)$$

which is positive independently of $\kappa_0$ and $\kappa_1$.

Transmission in state 1, instead, alters the transition probabilities, and increases the fraction of time spent by the primary source in state 2, while reducing the time spent in states 0 and 1. The total time spent in the absence of interference from the primary source, which is

$$\frac{1-\alpha}{1+\alpha(\rho+(1-\rho)\lambda\kappa_1)}, \qquad (43)$$

decreases as $\kappa_1$ is increased. If the secondary receiver incurs a high failure probability when decoding a signal interfered by the primary source, the average reward of the secondary source may suffer because of the larger average number of transmissions of the primary source's packets due to transmission in state 1. The derivative $\partial W_s(\kappa)/\partial\kappa_1$ is shown in Eq. (44),^12 and is positive if $\nu^*$ is smaller than the threshold in Eq. (45).

^12 $\kappa_0$ is set to one in the equations.



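\[ \frac{\partial W_s(\kappa)}{\partial\kappa_1} = \frac{\alpha\big(1-\lambda(1-\kappa_2-\alpha-\nu+\alpha\nu)(1-\rho)+\alpha\rho-\nu^*(1+(1-\rho)\lambda\kappa_2+\alpha\rho)\big)}{(1+\alpha(\rho+(1-\rho)\lambda\kappa_1))^2} \tag{44} \]

\[ \nu^* < \frac{1-\lambda(1-\kappa_2-\alpha-\nu+\alpha\nu)(1-\rho)+\alpha\rho}{1+(1-\rho)\lambda\kappa_2+\alpha\rho} \tag{45} \]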



There are thus regions of the parameters and transmission probability $\kappa_2$ such that an increased transmission probability in state 1 results in a smaller average reward. Note that an upper bound for $\partial W_s(\kappa)/\partial\kappa_1$ is obtained by setting $\kappa_2=1$. In fact, transmission in state 1 increases the steady-state probability of state 2, while transmission in the latter state does not modify the steady-state distribution. It is easy to see that the threshold in Eq. (45), if computed with $\kappa_2=1$, becomes smaller than or equal to 1 for any admissible set of parameters. Therefore, there exists a region of parameters such that the derivative of the average reward with respect to $\kappa_1$ is negative. In this region, any throughput-optimal policy sets $\kappa_1=0$. The optimal policy may, therefore, have a different structure than the one shown before for the case $\nu^*=\nu$. In particular, note that the optimal policy may not belong to the set of policies $\{\kappa : \Delta(\underline{0},\kappa)=\sigma\}\cup\underline{1}$, i.e., the optimal policy may provide a performance reduction to the primary source smaller than the maximum allowed.
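The $T=2$ analysis above can be checked numerically. The following is a minimal sketch, assuming the reward expression of Eq. (40) and the threshold of Eq. (45) with $\kappa_0=1$; the function names and parameter values are illustrative choices and do not come from the paper.

```python
def reward_T2(k1, k2, alpha, rho, lam, nu, nu_star, k0=1.0):
    """Average reward of the secondary source for T=2, following Eq. (40)."""
    x1 = rho + (1.0 - rho) * lam * k1  # failure prob. of the first transmission
    num = (1 - alpha) * (1 - nu) * k0 + (1 - nu_star) * alpha * (k1 + x1 * k2)
    den = 1 + alpha * x1
    return num / den


def nu_star_threshold(k2, alpha, rho, lam, nu):
    """Threshold of Eq. (45): dW_s/dk1 is positive iff nu_star is below it."""
    c = (1 - rho) * lam
    num = 1 + alpha * rho - c * (1 - k2 - alpha - nu + alpha * nu)
    den = 1 + c * k2 + alpha * rho
    return num / den


def d_reward_dk1(k1, k2, alpha, rho, lam, nu, nu_star, h=1e-6):
    """Central finite-difference estimate of dW_s/dk1 (with k0 = 1)."""
    up = reward_T2(k1 + h, k2, alpha, rho, lam, nu, nu_star)
    dn = reward_T2(k1 - h, k2, alpha, rho, lam, nu, nu_star)
    return (up - dn) / (2 * h)


if __name__ == "__main__":
    # Illustrative parameters (assumed for this example only).
    alpha, rho, lam, nu = 0.5, 0.2, 0.6, 0.2
    thr = nu_star_threshold(0.0, alpha, rho, lam, nu)
    for nu_star in (thr - 0.05, thr + 0.05):
        slope = d_reward_dk1(0.5, 0.0, alpha, rho, lam, nu, nu_star)
        print(f"nu*={nu_star:.3f}  threshold={thr:.3f}  dW/dk1={slope:+.5f}")
```

For these values the slope changes sign as $\nu^*$ crosses the threshold, which is the behavior the discussion attributes to interference at the secondary receiver.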

In general, if $\nu^*>\nu$ the mutual interaction between the activity of the secondary source and that of the primary source becomes more involved, and it is hard to provide a structure for the optimal policy. Intuitively, the larger $\nu^*$, the smaller the transmission probabilities in states $\theta=1,2,\ldots,T$, as the secondary source may maximize its own throughput by preserving the steady-state probability of state 0. The same reason may force the secondary source to concentrate its transmissions in the states corresponding to the last transmissions of a primary source's packet.

Numerical results illustrating the above discussion are shown in Section VII.

VI. Online Approaches: State Observation and 
Model Knowledge 

The solution of the linear program of Eq. (19) requires knowledge of the transition probability kernel as well as of the cost functions. However, it can be observed that a relatively small number of parameters (the failure probabilities $\rho$, $\rho^*$, $\nu$ and $\nu^*$, and the arrival probability $\alpha$) determine the transition probability matrix and the cost functions. Therefore, the estimation of the statistics of the stochastic process and of the cost functions is faster than in a totally unstructured environment.

The realization of the policy requires the perfect identification of the state of the primary user. In the network considered herein, the estimation of the state within the state space can be obtained by combined channel sensing and packet header decoding. In fact, the secondary user can distinguish state 0 from any other transmission state $\theta>0$ by sensing the channel and detecting the presence of a signal. The header of the packets transmitted by the primary user contains their sequence number. Therefore, by decoding the header the secondary user can count retransmissions of the same packet.

Footnote: In general, an upper bound is obtained by setting $\kappa_T=1$.

By decoding packet header and ACK/NACK feedback sent 
by the primary and secondary receivers, the secondary user 
can estimate the transition probability matrix and the cost 
functions, as well as identify the state of the primary user. 
Note that, as the decoding of packet headers and ACK/NACK 
is crucial to establish communications and instrumental for 
distributed access mechanisms, these packets are generally 
strongly encoded and available to all the neighbors of a node. 
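As a rough illustration of how the failure probabilities could be estimated from overheard ACK/NACK feedback, the sketch below counts primary failures separately for slots in which the secondary source was silent or active. The simulation of the feedback log, the function names and the sample size are assumptions made for this example only.

```python
import random


def simulate_feedback(n_slots, rho, lam, tx_prob, seed=0):
    """Hypothetical log of (secondary_transmitted, primary_failed) pairs,
    one entry per slot in which the primary source transmits."""
    rng = random.Random(seed)
    rho_star = rho + lam * (1 - rho)  # failure probability under interference
    log = []
    for _ in range(n_slots):
        sec_tx = rng.random() < tx_prob
        fail_p = rho_star if sec_tx else rho
        log.append((sec_tx, rng.random() < fail_p))
    return log


def estimate_failure_probs(log):
    """Empirical estimates of rho, rho* and lambda from the NACK counts."""
    counts = {False: [0, 0], True: [0, 0]}  # sec_tx -> [failures, total]
    for sec_tx, failed in log:
        counts[sec_tx][0] += int(failed)
        counts[sec_tx][1] += 1
    rho_hat = counts[False][0] / counts[False][1]
    rho_star_hat = counts[True][0] / counts[True][1]
    # Invert rho* = rho + lambda (1 - rho) to recover lambda.
    lam_hat = (rho_star_hat - rho_hat) / (1 - rho_hat)
    return rho_hat, rho_star_hat, lam_hat


if __name__ == "__main__":
    log = simulate_feedback(20000, rho=0.3, lam=0.3, tx_prob=0.5, seed=1)
    print(estimate_failure_probs(log))
```

Because only a handful of scalars parameterize the model, a few thousand observed slots already give usable estimates in this toy setting.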

If the statistics of the Markov chain and the cost functions are unknown, under the assumption of idealized state observation, reinforcement learning algorithms [21] can be employed to iteratively converge to the optimal strategy based on a sample path of observations. The convergence rate of learning algorithms decreases as the state space gets larger. However, techniques which approximate the learned functions may speed up learning [22].

In more complex network scenarios the exact identification 
of the state of the network, as well as the estimation of the 
statistics of the stochastic process which models its temporal 
evolution, might be very challenging. As observed in some recent work which extends the framework presented herein to online learning [23], [24], the secondary user may get access to only some features of the state space. For instance, if the primary user stores packets in a buffer, the number of packets in the buffer is hidden to the secondary user. If the secondary user fails to decode the header of a primary user's packet, it may detect the presence of a signal but the retransmission index remains unknown. Another example of hidden state
variable is the channel state of the primary links. Channel 
knowledge would increase the effectiveness of secondary 
user transmission. In fact, the secondary user can potentially 
reduce the impact of the generated interference by scheduling 
transmissions in those time slots in which the link between the 
primary transmitter and the primary receiver is very strong, 
and thus interference would not impair packet reception, or 
very weak, and thus the primary user packet would fail in 
any case. In the absence of channel state information, the 
secondary user bases its decision making on the average effect 
of actions over channel states, that is, the failure probability 
associated with idleness and transmission. Analogously, if a 
backoff mechanism is implemented by the primary users to 
regulate channel access the secondary user may be unable to 
distinguish between idleness due to empty buffer or backoff. 

In general, by observing the operations of the nodes it is possible to acquire a significant amount of information about the state of the network. The amount of information collectible by the secondary user depends on the transmission, access and networking protocols. For instance, the rigid access structure provided by Time Division Multiple Access (TDMA) provides more information to the observer than random access. In fact, an idle TDMA slot means that the assigned user has an empty buffer, whereas idleness in random access may be related to the access mechanism itself.

Figure 7. Throughput as a function of the maximum fraction of throughput loss, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.3$.

If the statistics of the process and the state-observation map are known to the secondary user, then the secondary user can base its decision on a belief vector [25] collecting the maximum likelihood distribution of the real state of the system. Since a priori knowledge of statistics and state-observation map is unrealistic in general scenarios, the approach proposed in [24] is to optimize the distribution of the states in the observation space, which collects all the possible observations, based on the estimated cost functions.

As this discussion indicates, the framework presented in this paper opens many exciting new areas of investigation.
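To make the belief-vector idea concrete, the following sketch performs one step of a standard Bayes filter over the primary user's state. The three-state chain (idle, first transmission, second transmission) with a silent secondary source and the "busy, header not decoded" observation are assumptions introduced for this example, not the construction used in [24] or [25].

```python
def belief_update(belief, transition, obs_likelihood):
    """One Bayes-filter step: predict with the transition matrix, then
    weight each state by the likelihood of the current observation."""
    n = len(belief)
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    unnorm = [predicted[s2] * obs_likelihood[s2] for s2 in range(n)]
    total = sum(unnorm)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnorm]


def silent_chain_T2(alpha, rho):
    """Assumed transition matrix for T=2 when the secondary source is silent.
    States: 0 = idle slot, 1 = first transmission, 2 = second transmission."""
    done = [1 - alpha, alpha, 0.0]  # next slot after a packet leaves service
    return [
        done,
        [(1 - rho) * (1 - alpha), (1 - rho) * alpha, rho],
        done,
    ]


if __name__ == "__main__":
    P = silent_chain_T2(alpha=0.8, rho=0.3)
    busy = [0.0, 1.0, 1.0]  # signal detected, retransmission index unknown
    b = [1.0, 0.0, 0.0]     # the previous slot was known to be idle
    for _ in range(2):
        b = belief_update(b, P, busy)
        print([round(v, 4) for v in b])
```

After two consecutive busy slots without a decoded header, the belief splits between the first and the second transmission state, which is the ambiguity described in the text.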

VII. Numerical Results 

In this Section, numerical results validating the findings 
and observations made throughout the paper are presented. 
We recall that $\alpha$ is the probability that the primary source transmits a fresh packet in a slot not allocated to packet retransmission; $\rho$ is the probability that the primary receiver fails to decode a packet sent by the primary source in a slot in which the secondary source is silent. The failure probability at the primary receiver if the secondary source transmits is $\rho^*=\rho+\lambda(1-\rho)$, where $\lambda$ is the failure probability increase. $\nu$ and $\nu^*$ are the failure probabilities at the secondary receiver if the primary source is silent and transmits, respectively. A failure probability increase is also defined for the secondary receiver by $\lambda_s$ such that $\nu^*=\nu+\lambda_s(1-\nu)$.

In Sections VII-A and VII-B we present numerical results for $\nu^*=\nu$, i.e., the Z-channel, where the constraint is defined on throughput and failure probability, respectively. Section VII-C presents numerical results for the case $\nu^*>\nu$.

In all the following plots, the maximum number of transmissions of a primary source's packet is fixed to $T=4$.
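The closed-form throughput used in the plots can be cross-checked with a slot-level simulation of the retransmission chain. The sketch below assumes the cost expression $J_p=\mathcal{N}_{J_P}/\mathcal{D}$ of Eqs. (46)-(47) and a simple slotted model in which a fresh packet starts with probability $\alpha$ after any idle slot or completed packet; the function names and the example policy are illustrative.

```python
import random


def primary_throughput(kappa, alpha, rho, lam):
    """Closed form 1 - N/D; kappa[t-1] is the transmission probability of
    the secondary source in state t = 1..T."""
    prods, p = [], 1.0
    for k in kappa:
        p *= rho + (1 - rho) * lam * k
        prods.append(p)  # product of failure probabilities up to state t
    num = (1 - alpha) + alpha * sum(prods)
    den = 1 + alpha * sum(prods[:-1])
    return 1 - num / den


def simulate(kappa, alpha, rho, lam, n_slots=200_000, seed=0):
    """Monte Carlo estimate of delivered primary packets per slot."""
    rng = random.Random(seed)
    T = len(kappa)
    state, delivered = 0, 0
    for _ in range(n_slots):
        if state == 0:  # idle slot: a fresh packet may start in the next slot
            state = 1 if rng.random() < alpha else 0
            continue
        interfered = rng.random() < kappa[state - 1]
        fail_p = rho + lam * (1 - rho) if interfered else rho
        failed = rng.random() < fail_p
        if failed and state < T:
            state += 1  # retransmit in the next slot
        else:
            delivered += 0 if failed else 1  # packet delivered or dropped
            state = 1 if rng.random() < alpha else 0
    return delivered / n_slots


if __name__ == "__main__":
    kappa = [1.0, 0.5, 0.0, 0.0]  # example policy, T = 4
    print(primary_throughput(kappa, 0.8, 0.3, 0.3))
    print(simulate(kappa, 0.8, 0.3, 0.3, seed=1))
```

The two printed values should agree up to Monte Carlo noise.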

A. Constraint on the primary source's throughput, $\nu^*=\nu$

In this Section, numerical results for the Z-channel network with a constraint on the throughput loss of the primary source are presented. The performance loss is parameterized through $\epsilon$, defined as the maximum fraction of throughput loss of the primary source, i.e., the maximum throughput loss is $\sigma=W_p(\underline{0})\epsilon=(1-J_p(\underline{0}))\epsilon$.

Figure 8. Transmission probabilities as a function of the maximum fraction of throughput loss, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.3$.

In Figs. 7 and 8 the throughput and the secondary source's transmission probability are depicted as a function of $\epsilon$. In the figures, $W_{p\max}$ and $W_{p\min}$ correspond to the throughput achieved by the primary source when the secondary source is silent and to the minimum throughput of the primary source according to the constraint, respectively.

The throughput of the secondary source increases as e is 
increased. A larger e allows the secondary source to inter- 
fere more with the primary source. The throughput actually 
achieved by the primary source decreases according to the 
increased maximum performance loss allowed, and it can be 
observed that the policy of the secondary source lowers the 
throughput of the primary one as much as possible in order 
to maximize the secondary throughput. When the throughput 
of the secondary source is equal to one, corresponding to 
the former transmitting with probability one in every slot, 
the throughput of the primary source stops decreasing, as the 
secondary source cannot interfere more. 

Fig. 8 shows that the policy of the secondary source follows the structure discussed before. Thus, with $\epsilon=0$ the secondary source is allowed to transmit only in the slots where the primary is not accessing the channel ($\kappa_0=1$ and $\kappa_t=0$, $t>0$). As $\epsilon$ increases, the transmission probability in state 1, i.e., $\kappa_1$, increases until it reaches unity. Then, $\kappa_2$ starts to increase and so on until all the $\kappa_t$'s are set to unity.
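The sequential filling just described can be sketched in a few lines for the Z-channel case with a throughput-loss constraint. This is only an illustration of the structure, assuming the cost of Eqs. (46)-(47) and the budget $\sigma=(1-J_p(\underline{0}))\epsilon$; the paper obtains the policy from a linear program, and the bisection below is a substitute chosen here for simplicity.

```python
def primary_cost(kappa, alpha, rho, lam):
    """J_p(kappa) = N/D; kappa[t-1] is the transmission probability in state t."""
    T = len(kappa)
    prods, p = [], 1.0
    for k in kappa:
        p *= rho + (1 - rho) * lam * k
        prods.append(p)  # product over i = 1..t
    num = (1 - alpha) + alpha * sum(prods)
    den = 1 + alpha * sum(prods[:T - 1])
    return num / den


def fill_policy(T, eps, alpha, rho, lam, tol=1e-10):
    """Raise kappa_1, then kappa_2, ... until the throughput-loss budget
    sigma = (1 - J_p(0)) * eps is used up."""
    kappa = [0.0] * T
    j0 = primary_cost(kappa, alpha, rho, lam)
    budget = (1 - j0) * eps
    for t in range(T):
        kappa[t] = 1.0
        if primary_cost(kappa, alpha, rho, lam) - j0 <= budget:
            continue  # this state can be used with probability one
        # The cost is monotone in kappa[t] (Theorem 1), so bisection applies.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            kappa[t] = mid
            if primary_cost(kappa, alpha, rho, lam) - j0 <= budget:
                lo = mid
            else:
                hi = mid
        kappa[t] = lo
        break
    return kappa


if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.3, 1.0):
        policy = fill_policy(4, eps, alpha=0.8, rho=0.3, lam=0.3)
        print(eps, [round(k, 3) for k in policy])
```

The returned vectors are non-increasing in the state index, with at most one fractional entry, which mirrors the shape of the curves in Fig. 8.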

The rates at which the various $\kappa_t$'s increase are different. In particular, the $\kappa_t$'s corresponding to transmission in states with small indices increase more slowly than those corresponding to large indices. In fact, interference in the states corresponding to the first transmissions of a packet generates a larger primary source's throughput reduction than interference in the later transmissions. Conversely, the throughput of the primary source gets larger, and so does the maximum throughput loss.

Figs. 9 and 10 show the same quantities as a function of $\alpha$, i.e., the arrival rate of new packets at the primary source. As expected, the throughput of the secondary source decreases as $\alpha$ increases. Fig. 11 depicts the average number of transmissions of the primary packets for the same parameters, with $\lambda=0.3$ and $\lambda=0.9$.

Footnote: In this and in the following Section, the throughput of the secondary source is normalized to $(1-\nu)$.

Figure 9. Throughput as a function of the arrival rate $\alpha$, where $\epsilon=0.1$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.3$.

A larger $\alpha$ means that the primary source is accessing the channel more often. Therefore, the number of slots in which the secondary source can transmit while meeting the constraint on the throughput loss of the primary source decreases. However, there is another effect of a large $\alpha$ that needs to be considered besides the scarcity of empty slots (in which the secondary source transmits with probability one). In fact, if the probability that a fresh packet is transmitted in an idle slot by the primary source is small, an increased average number of transmissions for each packet has a smaller effect on the throughput of the primary source. The additional retransmissions forced by the interference are likely to replace slots in which the primary source would be idle anyway, which are the slots in which the primary source incurs the highest possible cost. On the other hand, if $\alpha$ is large, additional retransmissions are performed instead of new packet transmissions, which collect an average cost smaller than that of an empty slot.

The relation between $\alpha$ and the interference generated by the secondary source to the primary receiver is illustrated in Fig. 10 and Fig. 11. Fig. 10 shows that the throughput trend of Fig. 9 does not only correspond to a smaller fraction of empty slots, but that the policy in the states $\theta>0$ is a function of the arrival rate. The fraction of slots in which the secondary source superposes its activity with that of the primary source, normalized by the fraction of slots in which the latter source transmits, decreases as $\alpha$ increases. The explanation for this behavior is given above. If $\alpha$ is small, the primary source is often idle, and the retransmissions induced by interference generate a smaller loss in the throughput of the primary source. In fact, if $\alpha$ is small, the secondary source is allowed to force more retransmissions (see Fig. 11). Note that again the policy follows the structure discussed in the previous section, where the $\kappa_t$'s sequentially turn off as the secondary source is forced to reduce the interference.

Finally, Fig. 12 shows $(J_s(\kappa_{hf})-J_s(\kappa))/J_s(\kappa)$ as a function of the arrival rate $\alpha$, where $J_s(\kappa_{hf})$ is the optimal cost for the horizontal flooding approach. The optimal transmission probabilities for the horizontal flooding approach are numerically found via a LP slightly more involved than that discussed herein. Thus, the curves represent the fraction of cost increase when horizontal flooding is adopted instead of vertical flooding. When $\alpha$ is sufficiently small, the cost increase is zero, as both approaches transmit in all states with probability one. As $\alpha$ increases, the secondary source is forced to reduce the fraction of time in which it transmits in both vertical and horizontal flooding. In the former case, the secondary source starts decreasing the transmission probability in state $T$, while in the latter, the transmission probability is reduced in all states $t>0$. However, as soon as the policy $\underline{1}$ (transmission with probability one in all states) becomes inadmissible, in order to meet the constraint on the maximum throughput loss, the horizontal approach is forced to reduce the average transmission time of the secondary source much more quickly than the vertical approach. Then, as the arrival rate $\alpha$ is further increased, the cost increase diminishes, since the advantage due to the concentration of the interference in states with smaller indices vanishes. In fact, vertical flooding improves the delivery probability of a packet at the expense of a larger average number of transmissions.

Footnote: If $J_s(\kappa_{hf})=J_s(\kappa)=0$ the ratio is assumed to be equal to zero.

Figure 10. Transmission probabilities as a function of the arrival rate $\alpha$, where $\epsilon=0.1$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.3$.

Figure 11. Average number of transmissions as a function of $\alpha$, where $\epsilon=0.1$, $\rho=0.3$, $\nu=\nu^*=0$, $\lambda=0.3$.

Figure 12. Cost increase as a function of $\alpha$, where $\epsilon=0.1$, $\rho=0.3$, $\nu=\nu^*=0$, $\lambda=0.3$.

Figure 13. Throughput of the secondary source as a function of the maximum performance loss $\epsilon$, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.1$.

Figure 14. Transmission probabilities as a function of the maximum performance loss $\epsilon$, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.1$.

Figure 15. Packet failure probability as a function of the maximum performance loss $\epsilon$, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.1$.



B. Constraint on the primary source's failure probability, $\nu^*=\nu$

In this Section, results for the optimization problem with a constraint defined on the failure probability of the primary source's packets are presented. The activity of the secondary source increases the failure probability, and the maximum failure probability allowed is given by $\sigma$. This bound, $\sigma$, is again parameterized through $\epsilon$, defined as the maximum relative failure probability increase, that is, $\sigma=\mathcal{F}(\underline{0})(1+\epsilon)$, where $\mathcal{F}(\underline{0})=\rho^T$ is the packet failure probability when the secondary source is always idle.

Figs. 13, 14 and 15 show the throughput of the secondary source, the secondary source transmission probability, and the failure probability as a function of $\epsilon$, respectively. In Fig. 15, $\mathcal{F}_{\min}$ is the failure probability associated with an idle secondary source, and $\mathcal{F}_{\max}$ is the maximum failure probability according to the constraint.

Intuitively, the throughput, as well as the overall fraction of slots in which the secondary source transmits, increases as the maximum failure probability of the primary source's packets increases. The transmission strategy of the secondary source follows the structure discussed throughout the paper. As the constraint becomes less stringent, first transmission in state 0 is increased, then transmission in state 1 and so on, until the secondary source transmits with probability one in all states.

Footnote: We remark that by failure probability of a packet, we refer to the probability that all the $T$ transmissions fail.

Figs. 16, 17 and 18 provide the same metrics of the previous figures as a function of the failure probability of the primary source's transmissions $\rho$. Note that the failure probability of the packets of the primary source if the secondary source is always idle is $\rho^T$.

The throughput of the secondary source, as well as the transmission probabilities in the states $\theta>0$, increase as $\rho$ becomes larger. We observe the following:

• the maximum $\sigma=\rho^T(1+\epsilon)$ polynomially increases with $\rho$. This means that the constraint becomes less stringent as $\rho$ increases (see Fig. 18);

• the primary source transmits in a larger fraction of slots as $\rho$ increases, due to a larger average number of retransmissions. The secondary source has fewer empty slots in which to transmit without interfering with the primary source.

If $\rho$ is small, and the secondary source keeps idle, the primary source transmits in a fraction of slots close to $\alpha$. As $\rho$ gets larger, the primary source increases the fraction of slots in which it transmits, because of the retransmissions. Nevertheless, the constraint becomes less stringent as $\rho$ increases. In fact, the maximum allowed failure probability is $\sigma=\rho^T(1+\epsilon)$. Therefore, as $\rho$ increases the secondary source can increase its activity in states $\theta>0$. The tradeoff between those two effects determines the optimal throughput achieved by the secondary source. For the considered set of parameters, the increase of $\sigma$ wins over the decrease of the number of empty slots.

Figure 16. Throughput of the secondary source as a function of the failure probability of the primary source in the absence of interference $\rho$, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.1$.

Figure 17. Transmission probabilities as a function of the failure probability of the primary source in the absence of interference $\rho$, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.1$.

C. Case v*>v 

In this Section, illustrative results for the general case $\nu^*>\nu$ are shown. As discussed in Section V, in this case, the structure of the optimal policy depends on the parameters. In fact, due to the effect of the interference by the primary source at the secondary receiver, the secondary source may be forced to be silent in states $\theta>0$ in order not to decrease the steady-state probability of the empty-slot state 0. In the following, the constraint is defined on the throughput loss of the primary source.

The throughput and the transmission probabilities as a function of $\lambda_s$ are depicted in Figs. 19 and 20. We recall that $\lambda_s\in[0,1]$ determines how decoding at the secondary receiver is hampered by primary source's transmissions. The values $\lambda_s=0$ and $\lambda_s=1$ correspond to $\nu^*=\nu$ and $\nu^*=1$, respectively.

As a first observation, the throughput of the secondary source decreases as $\lambda_s$ increases. In fact, the effect of interference both decreases the reward associated with transmission in the states $\theta>0$ and forces the secondary source to reduce its overall activity. For the same reason, the throughput of the primary source increases and moves close to the maximum throughput, that is, the secondary source rarely interferes with the primary source.

In fact, as $\nu^*$ gets closer to one, the secondary source reduces transmission, and interference, in states $0<\theta<T$. This is done in order to avoid retransmissions, which would reduce the availability of white space. Interference in state $T$ does not induce a higher probability of further retransmissions. Therefore, $\kappa_T$ remains one as long as it is admissible according to the constraint. Note that $\kappa_1$, i.e., the transmission probability in the state which has the largest impact on the average number of retransmissions, is set to 0. For this configuration of parameters, the policy takes the opposite form with respect to that described for the case $\nu^*=\nu$, i.e., the secondary source concentrates transmissions in the last states of the chain, in order to have a smaller impact on the number of transmissions of the primary source's packets.

Figure 18. Packet failure probability as a function of the failure probability of the primary source in the absence of interference $\rho$, where $\alpha=0.8$, $\rho=0.3$, $\nu=\nu^*=0$ and $\lambda=0.1$.



VIII. Conclusions 

In contrast to much prior work on cognitive networks, in this 
paper we investigated a scenario wherein the secondary source 
is allowed to superpose its transmissions over those of the pri- 
mary source. The secondary source aims to maximize its own 
throughput, while guaranteeing a bounded performance loss 
for the primary source. We derived the optimal transmission 
policy for the secondary user when the primary user adopts 
a retransmission based error control scheme. If the failure probability at the secondary receiver is not increased by the primary source's transmissions, the resulting optimal strategy of the secondary user has a unique structure. In particular, the optimal throughput is achieved by the secondary user by concentrating its interference to the primary user in the first transmissions of a packet. This is a first step toward a better understanding of interference control strategies in dynamic wireless networks.

Figure 19. Throughput as a function of $\lambda_s$, where $\alpha=0.5$, $\lambda=0.6$, $\rho=\nu=0.2$ and $\epsilon=0.05$.

Figure 20. Transmission probabilities as a function of $\lambda_s$, where $\alpha=0.5$, $\lambda=0.6$, $\rho=\nu=0.2$ and $\epsilon=0.05$.



Appendix A
Proof of Theorem 1

Proof: Theorem 1 states that if any component $\kappa_\theta$ of the vector $\kappa$ is increased, with $\theta>0$, then $J_p(\kappa)$ increases. This corresponds to the intuitive fact that a larger transmission probability of the secondary source in any of the states in which the primary source transmits results in a smaller throughput achieved by the latter.

In order to prove this result, we show that $\partial J_p(\kappa)/\partial\kappa_\theta>0$, $\forall\theta>0$.

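Let us introduce the following notation

\[ \mathcal{N}_{J_P}(\kappa) = (1-\alpha)+\alpha\sum_{t=1}^{T}\prod_{i=1}^{t}(\rho+(1-\rho)\lambda\kappa_i) \tag{46} \]

\[ \mathcal{D}(\kappa) = 1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}(\rho+(1-\rho)\lambda\kappa_i). \tag{47} \]

The cost is then $J_p(\kappa)=\mathcal{N}_{J_P}(\kappa)/\mathcal{D}(\kappa)$. In the following, the obvious dependence of the above functions on $\kappa$ is dropped from the notation.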

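The derivative of the cost of the primary source can be obtained through the well-known formula

\[ \frac{\partial J_p}{\partial\kappa_\theta} = \frac{\frac{\partial\mathcal{N}_{J_P}}{\partial\kappa_\theta}\mathcal{D}-\mathcal{N}_{J_P}\frac{\partial\mathcal{D}}{\partial\kappa_\theta}}{\mathcal{D}^2}. \tag{48} \]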

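In the previous equation, the denominator is always positive, and thus we focus on the numerator. We have

\[ \frac{\partial\mathcal{N}_{J_P}}{\partial\kappa_\theta} = \alpha\sum_{t=\theta}^{T}(1-\rho)\lambda\prod_{i=1,i\neq\theta}^{t}(\rho+(1-\rho)\lambda\kappa_i) \tag{49} \]

\[ \frac{\partial\mathcal{D}}{\partial\kappa_\theta} = \alpha\sum_{t=\theta}^{T-1}(1-\rho)\lambda\prod_{i=1,i\neq\theta}^{t}(\rho+(1-\rho)\lambda\kappa_i). \tag{50} \]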



Through simple algebraic manipulation, we obtain the expression in Eq. (51). Since

\[ 1-\prod_{i=1}^{T}(\rho+(1-\rho)\lambda\kappa_i) > 0, \tag{52} \]

then all the terms in Eq. (51) are strictly positive. Therefore, the derivative is strictly positive and the cost is a monotonically increasing function of any element $\kappa_\theta$ with $\theta>0$. ■
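The derivative used in this proof can be verified numerically. The sketch below assumes Eqs. (46)-(50) as written above and compares the resulting closed-form derivative with a central finite difference; the function names and the test policy are illustrative and not part of the proof.

```python
def num_den(kappa, alpha, rho, lam):
    """Numerator N and denominator D of the primary cost, Eqs. (46)-(47)."""
    prods, p = [], 1.0
    for k in kappa:
        p *= rho + (1 - rho) * lam * k
        prods.append(p)
    N = (1 - alpha) + alpha * sum(prods)
    D = 1 + alpha * sum(prods[:-1])
    return N, D


def cost_gradient(kappa, theta, alpha, rho, lam):
    """dJ_p/dkappa_theta from Eqs. (48)-(50); theta is 1-based."""
    T = len(kappa)
    x = [rho + (1 - rho) * lam * k for k in kappa]
    c = alpha * (1 - rho) * lam
    partial = []  # partial[t - theta] = prod_{i=1..t, i != theta} x_i
    p = 1.0
    for i in range(1, T + 1):
        if i != theta:
            p *= x[i - 1]
        if i >= theta:
            partial.append(p)
    dN = c * sum(partial)       # Eq. (49): t = theta..T
    dD = c * sum(partial[:-1])  # Eq. (50): t = theta..T-1
    N, D = num_den(kappa, alpha, rho, lam)
    return (dN * D - N * dD) / D ** 2


if __name__ == "__main__":
    kappa, alpha, rho, lam, h = [0.2, 0.5, 0.1, 0.7], 0.8, 0.3, 0.3, 1e-6
    for theta in range(1, len(kappa) + 1):
        up, dn = list(kappa), list(kappa)
        up[theta - 1] += h
        dn[theta - 1] -= h
        n_up, d_up = num_den(up, alpha, rho, lam)
        n_dn, d_dn = num_den(dn, alpha, rho, lam)
        fd = (n_up / d_up - n_dn / d_dn) / (2 * h)
        print(theta, cost_gradient(kappa, theta, alpha, rho, lam), fd)
```

For every state index the two columns should agree closely and be positive, in line with the statement of Theorem 1.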

Appendix B
Proof of Theorem 2

Proof: Theorem 2 states that if the secondary source transmits with a higher probability in any state $\theta$, i.e., $\kappa_\theta$ is increased, then the average throughput achieved by the secondary source increases. This may appear a trivial consideration. However, it must be observed that the transmission probabilities $\kappa_\theta$, $0<\theta<T$, influence the steady-state distribution of the Markov chain of the network and thus influence the average throughput of the secondary user.

A larger $\kappa_\theta$ results in a larger probability that the primary source fails the $\theta$-th transmission of a packet, and, thus, a larger probability that the Markov process moves to states $\theta+1,\ldots,T$. As a consequence, the steady-state probabilities of the latter states increase, while those of states $0,\ldots,\theta$ decrease. Intuition suggests that, in some cases, a larger $\kappa_\theta$ may result in a smaller overall average transmission probability of the secondary source, i.e., $\sum_t\pi_\kappa(t)\kappa_t$. Theorem 2 instead ensures that a larger value of any of the $\kappa_\theta$ always results in a larger $\sum_t\pi_\kappa(t)\kappa_t$.



Footnote: Note that in the degenerate cases $\alpha=0$, $\lambda=0$, or $\rho=1$, Eq. (51) is equal to zero.



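\[ \frac{\partial\mathcal{N}_{J_P}}{\partial\kappa_\theta}\mathcal{D}-\mathcal{N}_{J_P}\frac{\partial\mathcal{D}}{\partial\kappa_\theta} = \alpha(1-\rho)\lambda\prod_{i=1,i\neq\theta}^{T}(\rho+(1-\rho)\lambda\kappa_i)\,\mathcal{D} + \alpha\Big(1-\prod_{i=1}^{T}(\rho+(1-\rho)\lambda\kappa_i)\Big)\Big(\alpha\sum_{t=\theta}^{T-1}(1-\rho)\lambda\prod_{i=1,i\neq\theta}^{t}(\rho+(1-\rho)\lambda\kappa_i)\Big) \tag{51} \]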



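The proof of this theorem is analogous to that of Theorem 1. In particular, we show in the following that $\partial W_s(\kappa)/\partial\kappa_\theta>0$, $\forall\theta\in\{0,\ldots,T\}$.

Recalling the definition of $\mathcal{D}(\kappa)$ given in the previous Theorem, we write $W_s(\kappa)=\mathcal{N}_{W_S}(\kappa)/\mathcal{D}(\kappa)$, where

\[ \mathcal{N}_{W_S}(\kappa) = (1-\alpha)+\alpha\sum_{t=1}^{T}\kappa_t\prod_{i=1}^{t-1}(\rho+(1-\rho)\lambda\kappa_i). \tag{53} \]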

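In the following, we drop from the notation the dependence of these functions on the policy. The derivative of the numerator is

\[ \frac{\partial\mathcal{N}_{W_S}}{\partial\kappa_\theta} = \alpha\prod_{i=1}^{\theta-1}(\rho+(1-\rho)\lambda\kappa_i)+\alpha\sum_{t=\theta+1}^{T}(1-\rho)\lambda\kappa_t\prod_{i=1,i\neq\theta}^{t-1}(\rho+(1-\rho)\lambda\kappa_i). \tag{54} \]

Again, the derivative can be written as

\[ \frac{\partial W_s}{\partial\kappa_\theta} = \frac{\frac{\partial\mathcal{N}_{W_S}}{\partial\kappa_\theta}\mathcal{D}-\mathcal{N}_{W_S}\frac{\partial\mathcal{D}}{\partial\kappa_\theta}}{\mathcal{D}^2}. \tag{55} \]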



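We first show the following Lemma:
Lemma 1: Consider $\mathcal{N}_{W_S}$ and $\mathcal{D}$ as previously defined; the following can be shown:

\[ \partial\mathcal{N}_{W_S}/\partial\kappa_\theta > \partial\mathcal{D}/\partial\kappa_\theta, \tag{56} \]

with $0<\theta\leq T$.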

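Proof: If $\theta=T$, we have

\[ \partial\mathcal{N}_{W_S}/\partial\kappa_T-\partial\mathcal{D}/\partial\kappa_T = \alpha\prod_{i=1}^{T-1}(\rho+(1-\rho)\lambda\kappa_i), \tag{57} \]

that is clearly positive.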

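Assume $\theta<T$. The expressions

\[ 1+(1-\rho)\lambda\kappa_{\theta+1}+\sum_{t=\theta+2}^{T}(1-\rho)\lambda\kappa_t\prod_{i=\theta+1}^{t-1}(\rho+(1-\rho)\lambda\kappa_i) \tag{58} \]

and

\[ (1-\rho)\lambda\Big[1+\sum_{t=\theta+1}^{T-1}\prod_{i=\theta+1}^{t}(\rho+(1-\rho)\lambda\kappa_i)\Big] \tag{59} \]

are $\partial\mathcal{N}_{W_S}/\partial\kappa_\theta$ and $\partial\mathcal{D}/\partial\kappa_\theta$ divided by $\alpha\prod_{i=1}^{\theta-1}(\rho+(1-\rho)\lambda\kappa_i)$, respectively.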

Eqs. (58) and (59) can be reorganized as

\[ 1+(1-\rho)\lambda\sum_{t=\theta+1}^{T}\rho^{t-\theta-1}\kappa_t+C_1 \tag{60} \]

and

\[ (1-\rho^{T-\theta})\lambda+(1-\rho)\lambda^2\sum_{t=\theta+1}^{T-1}(\rho^{t-\theta-1}-\rho^{T-\theta-1})\kappa_t+C_2, \tag{61} \]

respectively, where the constants $C_1$ and $C_2$ account for all the cross-terms involving the multiplications of two or more variables $\kappa_t$. We do not provide here the expressions for $C_1$ and $C_2$, as they are conceptually simple, but tedious. However, it is possible to show that $C_1>C_2$.



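Since $1>(1-\rho^{T-\theta})\lambda$ and

\[ (1-\rho)\lambda\sum_{t=\theta+1}^{T}\rho^{t-\theta-1}\kappa_t \geq (1-\rho)\lambda^2\sum_{t=\theta+1}^{T-1}(\rho^{t-\theta-1}-\rho^{T-\theta-1})\kappa_t, \tag{62} \]

then $\partial\mathcal{N}_{W_S}/\partial\kappa_\theta>\partial\mathcal{D}/\partial\kappa_\theta$. ■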
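As for the proof of Theorem 1, since $\mathcal{D}^2>0$, we focus on $\frac{\partial\mathcal{N}_{W_S}}{\partial\kappa_\theta}\mathcal{D}-\frac{\partial\mathcal{D}}{\partial\kappa_\theta}\mathcal{N}_{W_S}$. Due to Lemma 1, $\frac{\partial\mathcal{N}_{W_S}}{\partial\kappa_\theta}>\frac{\partial\mathcal{D}}{\partial\kappa_\theta}$. Moreover, $\mathcal{D}\geq\mathcal{N}_{W_S}$. In fact,

\[ W_s(\kappa) = \frac{\mathcal{N}_{W_S}}{\mathcal{D}} \leq 1. \tag{63} \]

Therefore,

\[ \frac{\partial\mathcal{N}_{W_S}}{\partial\kappa_\theta}\mathcal{D} > \frac{\partial\mathcal{D}}{\partial\kappa_\theta}\mathcal{D} \geq \frac{\partial\mathcal{D}}{\partial\kappa_\theta}\mathcal{N}_{W_S}, \tag{64} \]

and $\frac{\partial\mathcal{N}_{W_S}}{\partial\kappa_\theta}\mathcal{D}-\frac{\partial\mathcal{D}}{\partial\kappa_\theta}\mathcal{N}_{W_S}>0$. ■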
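A quick randomized check of the monotonicity claimed by Theorem 2 is sketched below. It assumes the normalized reward $W_s=\mathcal{N}_{W_S}/\mathcal{D}$ of Eqs. (47) and (53) with $\kappa_0=1$; the sampling ranges, the fixed increment of 0.2 and the function names are choices made for this example.

```python
import random


def reward(kappa, alpha, rho, lam):
    """Normalized W_s(kappa) = N_W / D, with kappa[t-1] the transmission
    probability of the secondary source in state t = 1..T."""
    n_w, d, p = 1 - alpha, 1.0, 1.0  # p = prod_{i=1..t-1} x_i
    for t, k in enumerate(kappa, start=1):
        n_w += alpha * k * p
        p *= rho + (1 - rho) * lam * k  # p = prod_{i=1..t} x_i
        if t < len(kappa):
            d += alpha * p
    return n_w / d


def is_monotone(trials=1000, T=4, seed=0):
    """Return True if raising one random kappa_theta never lowers the reward."""
    rng = random.Random(seed)
    for _ in range(trials):
        alpha, rho, lam = (rng.uniform(0.1, 0.9) for _ in range(3))
        kappa = [rng.uniform(0.0, 0.8) for _ in range(T)]
        theta = rng.randrange(T)
        bumped = list(kappa)
        bumped[theta] += 0.2
        if reward(bumped, alpha, rho, lam) <= reward(kappa, alpha, rho, lam):
            return False
    return True


if __name__ == "__main__":
    print(is_monotone())
```

Such a test does not replace the proof, but it helps catch transcription errors in the expressions for $\mathcal{N}_{W_S}$ and $\mathcal{D}$.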



Appendix C
Proof of Theorem 3

Proof: Theorem 3 states that starting from a policy $\kappa$, for any pair of indices $j$ and $r$ with $0<j<r$ such that $\kappa_j=\kappa_r$ and $\kappa_t=0$, $r<t\leq T$, if

\[ J_p(\kappa')=J_p(\kappa''), \tag{65} \]

then

\[ W_s(\kappa')>W_s(\kappa''), \tag{66} \]

with $\kappa'=\kappa+u_j\delta'_j$ and $\kappa''=\kappa+u_r\delta''_r$, $0<\delta'_j<1-\kappa_j$ and $0<\delta''_r<1-\kappa_r$.

In words, if the policy obtained by increasing the $j$-th element of $\kappa$ by $\delta'_j$ and the policy obtained by increasing the $r$-th element of $\kappa$ by $\delta''_r$ incur the same primary source's average cost, then the former policy achieves a larger secondary source's average reward.
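The statement can be illustrated on a small instance. The sketch below assumes the cost and reward expressions of Eqs. (69)-(71) (with $\kappa_0=1$ and the reward normalized as in the Z-channel case), raises $\kappa_r$ by a chosen amount, and then finds by bisection the increase of $\kappa_j$ that yields the same primary cost. The parameter values and function names are illustrative.

```python
def cost_and_reward(kappa, alpha, rho, lam):
    """Return (J_p, W_s) = (N_J / D, N_W / D) for transmission probabilities
    kappa[t-1] in states t = 1..T."""
    T = len(kappa)
    n_j, n_w, d, p = 1 - alpha, 1 - alpha, 1.0, 1.0
    for t, k in enumerate(kappa, start=1):
        n_w += alpha * k * p            # here p = prod_{i<t} x_i
        p *= rho + (1 - rho) * lam * k  # now p = prod_{i<=t} x_i
        n_j += alpha * p
        if t < T:
            d += alpha * p
    return n_j / d, n_w / d


def matching_delta(kappa, j, target_cost, alpha, rho, lam, tol=1e-12):
    """Bisection for the increase of kappa_j (1-based j) that attains
    target_cost; valid because the cost is monotone in kappa_j."""
    lo, hi = 0.0, 1.0 - kappa[j - 1]
    while hi - lo > tol:
        mid = (lo + hi) / 2
        trial = list(kappa)
        trial[j - 1] += mid
        if cost_and_reward(trial, alpha, rho, lam)[0] < target_cost:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    alpha, rho, lam = 0.8, 0.3, 0.3
    kappa, j, r, delta_r = [0.0, 0.0, 0.0], 1, 2, 0.5
    late = list(kappa)
    late[r - 1] += delta_r
    Z, w_late = cost_and_reward(late, alpha, rho, lam)
    delta_j = matching_delta(kappa, j, Z, alpha, rho, lam)
    early = list(kappa)
    early[j - 1] += delta_j
    cost_early, w_early = cost_and_reward(early, alpha, rho, lam)
    print(f"delta_j={delta_j:.4f}  cost={cost_early:.5f} vs {Z:.5f}")
    print(f"reward early={w_early:.5f}  late={w_late:.5f}")
```

In this instance the matching increase in the earlier state is smaller than the increase in the later state, yet it yields the larger reward at equal primary cost.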

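We briefly recall the notation introduced in the previous proofs. The average primary source's cost and secondary source's reward can be written respectively as

\[ J_p(\kappa) = \frac{\mathcal{N}_{J_P}(\kappa)}{\mathcal{D}(\kappa)} \tag{67} \]

\[ W_s(\kappa) = \frac{\mathcal{N}_{W_S}(\kappa)}{\mathcal{D}(\kappa)} \tag{68} \]

where

\[ \mathcal{N}_{J_P}(\kappa) = (1-\alpha)+\alpha\sum_{t=1}^{T}\prod_{i=1}^{t}(\rho+(1-\rho)\lambda\kappa_i)>0 \tag{69} \]

\[ \mathcal{D}(\kappa) = 1+\alpha\sum_{t=1}^{T-1}\prod_{i=1}^{t}(\rho+(1-\rho)\lambda\kappa_i)>0, \tag{70} \]

\[ \mathcal{N}_{W_S}(\kappa) = (1-\alpha)+\alpha\sum_{t=1}^{T}\kappa_t\prod_{i=1}^{t-1}(\rho+(1-\rho)\lambda\kappa_i)>0. \tag{71} \]

In the following, in order to simplify the notation, we drop the subscripts $P$ and $S$ and we refer to the primary source's cost and secondary source's reward when talking of cost and reward, respectively.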



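\[ J(\kappa+u_j\delta'_j) = \frac{\mathcal{N}_J(\kappa)+\Delta\mathcal{N}_J(j,\delta'_j,\kappa)+\Delta\mathcal{N}_J(r,\delta'_j,\kappa)-\Delta\mathcal{N}_J(r,\delta'_j,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(j,\delta'_j,\kappa)+\Delta\mathcal{D}(r,\delta'_j,\kappa)-\Delta\mathcal{D}(r,\delta'_j,\kappa)} = \frac{\mathcal{N}_J+(B+C)\,\delta'_j}{\mathcal{D}+(A+C)\,\delta'_j} \tag{84} \]

\[ J(\kappa+u_r\delta''_r) = \frac{\mathcal{N}_J(\kappa)+\Delta\mathcal{N}_J(r,\delta''_r,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(r,\delta''_r,\kappa)} = \frac{\mathcal{N}_J+B\,\delta''_r}{\mathcal{D}+A\,\delta''_r} \tag{85} \]

\[ W(\kappa+u_j\delta'_j) = \frac{\mathcal{N}_W(\kappa)+\Delta\mathcal{N}_W(j,\delta'_j,\kappa)+\Delta\mathcal{N}_W(r,\delta'_j,\kappa)-\Delta\mathcal{N}_W(r,\delta'_j,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(j,\delta'_j,\kappa)+\Delta\mathcal{D}(r,\delta'_j,\kappa)-\Delta\mathcal{D}(r,\delta'_j,\kappa)} = \frac{\mathcal{N}_W+(G+F)\,\delta'_j}{\mathcal{D}+(A+C)\,\delta'_j} \tag{86} \]

\[ W(\kappa+u_r\delta''_r) = \frac{\mathcal{N}_W(\kappa)+\Delta\mathcal{N}_W(r,\delta''_r,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(r,\delta''_r,\kappa)} = \frac{\mathcal{N}_W+G\,\delta''_r}{\mathcal{D}+A\,\delta''_r} \tag{87} \]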


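If the $q$-th element of $\kappa$, with $q>0$, is increased by $\delta$, with $\delta<1-\kappa_q$, the average cost and reward can be written as

\[ J(\kappa+u_q\delta) = \frac{\mathcal{N}_J(\kappa)+\Delta\mathcal{N}_J(q,\delta,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(q,\delta,\kappa)}, \tag{72} \]

\[ W(\kappa+u_q\delta) = \frac{\mathcal{N}_W(\kappa)+\Delta\mathcal{N}_W(q,\delta,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(q,\delta,\kappa)}, \tag{73} \]

where

\[ \Delta\mathcal{N}_J(q,\delta,\kappa) = \delta\,\alpha(1-\rho)\lambda\sum_{t=q}^{T}\prod_{i=1,i\neq q}^{t}(\rho+(1-\rho)\lambda\kappa_i), \tag{74} \]

\[ \Delta\mathcal{D}(q,\delta,\kappa) = \delta\,\alpha(1-\rho)\lambda\sum_{t=q}^{T-1}\prod_{i=1,i\neq q}^{t}(\rho+(1-\rho)\lambda\kappa_i), \tag{75} \]

\[ \Delta\mathcal{N}_W(q,\delta,\kappa) = \delta\Big[\alpha\prod_{i=1}^{q-1}(\rho+(1-\rho)\lambda\kappa_i)+\alpha(1-\rho)\lambda\sum_{t=q+1}^{T}\kappa_t\prod_{i=1,i\neq q}^{t-1}(\rho+(1-\rho)\lambda\kappa_i)\Big]. \tag{76} \]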

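Thus, $\Delta\mathcal{N}_J(q,\delta,\kappa)$, $\Delta\mathcal{N}_W(q,\delta,\kappa)$ and $\Delta\mathcal{D}(q,\delta,\kappa)$ are linear functions of $\delta$, and represent the increment of the numerator of the cost and reward, and of their denominator, corresponding to an increase $\delta$ of $\kappa_q$. Note that if $\delta$ is strictly positive and $q>0$, then $\Delta\mathcal{N}_J(q,\delta,\kappa)$, $\Delta\mathcal{D}(q,\delta,\kappa)$ and $\Delta\mathcal{N}_W(q,\delta,\kappa)$ are strictly positive.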

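According to the hypothesis of the theorem, $J(\kappa+u_j\delta'_j)=J(\kappa+u_r\delta''_r)$. This equality can be rewritten as

\[ \frac{\mathcal{N}_J(\kappa)+\Delta\mathcal{N}_J(j,\delta'_j,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(j,\delta'_j,\kappa)} = \frac{\mathcal{N}_J(\kappa)+\Delta\mathcal{N}_J(r,\delta''_r,\kappa)}{\mathcal{D}(\kappa)+\Delta\mathcal{D}(r,\delta''_r,\kappa)}. \tag{77} \]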

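The increases of the numerators and denominators can be rewritten as

\[ \Delta\mathcal{D}(r,\delta,\kappa) = \delta A, \tag{78} \]
\[ \Delta\mathcal{N}_J(r,\delta,\kappa) = \delta B, \tag{79} \]
\[ \Delta\mathcal{N}_W(r,\delta,\kappa) = \delta G. \tag{80} \]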

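Note that, since $\kappa_j=\kappa_r$, the difference between the increases of the denominator when $\kappa_j$ or $\kappa_r$ is increased by $\delta$ is $\delta$ times a constant $C$:

\[ \Delta\mathcal{D}(j,\delta,\kappa)-\Delta\mathcal{D}(r,\delta,\kappa) = \delta\,\alpha(1-\rho)\lambda\sum_{t=j}^{r-1}\prod_{i=1,i\neq j}^{t}(\rho+(1-\rho)\lambda\kappa_i) = \delta C>0. \tag{81} \]

Footnote: Together with the assumptions $\lambda,\alpha,1-\rho>0$.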

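Analogously, the differences between the numerators of the cost and reward increases are

\[ \Delta\mathcal{N}_J(j,\delta,\kappa)-\Delta\mathcal{N}_J(r,\delta,\kappa) = \Delta\mathcal{D}(j,\delta,\kappa)-\Delta\mathcal{D}(r,\delta,\kappa) = \delta C, \tag{82} \]

and

\[ \Delta\mathcal{N}_W(j,\delta,\kappa)-\Delta\mathcal{N}_W(r,\delta,\kappa) = \delta\alpha\Big[\prod_{i=1}^{j-1}(\rho+(1-\rho)\lambda\kappa_i)-\prod_{i=1}^{r-1}(\rho+(1-\rho)\lambda\kappa_i)\Big]+\delta\alpha(1-\rho)\lambda\sum_{t=j+1}^{r}\kappa_t\prod_{i=1,i\neq j}^{t-1}(\rho+(1-\rho)\lambda\kappa_i) = \delta F, \tag{83} \]

respectively.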

According to Eqs. (78)-(83), and omitting the dependency of the quantities on $\kappa$, we rewrite $J(\kappa+u_j\delta'_j)$, $J(\kappa+u_r\delta''_r)$, $W(\kappa+u_j\delta'_j)$ and $W(\kappa+u_r\delta''_r)$ as shown in Eqs. (84), (85), (86) and (87), respectively.

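Note that

\[ \frac{\mathcal{N}_J+(B+C)\,\delta}{\mathcal{D}+(A+C)\,\delta} > \frac{\mathcal{N}_J+B\,\delta}{\mathcal{D}+A\,\delta} = J(\kappa+u_r\delta) \tag{88} \]

for any $\delta$, with $0<\delta\leq\min(1-\kappa_j,1-\kappa_r)$.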

Choose $\delta''_r$, with $0<\delta''_r<1-\kappa_r$, and denote the cost of policy $\kappa+u_r\delta''_r$ with $Z=J(\kappa+u_r\delta''_r)$. Observe that, due to the monotonicity of the cost function, $J(\kappa)<Z<1$.

Since the cost function is continuous with respect to any element of the policy vector and $\kappa_j=\kappa_r$ by assumption, there always exists $\delta'_j$ such that $J(\kappa+u_j\delta'_j)=J(\kappa+u_r\delta''_r)=Z$, with $0<\delta'_j<\delta''_r<1-\kappa_r=1-\kappa_j$.

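The values for $\delta'_j$ and $\delta''_r$ can be readily found to be

\begin{align}
\delta'_j &= \frac{\mathcal{D}Z-\mathcal{N}_{\mathcal{J}}}{B+C-(A+C)Z}, \tag{89}\\
\delta''_r &= \frac{\mathcal{D}Z-\mathcal{N}_{\mathcal{J}}}{B-AZ}. \tag{90}
\end{align}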
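As a quick sanity check of Eqs. (89) and (90), the sketch below plugs the two step sizes back into the cost ratios and confirms that both reach the same target $Z$. The numerical values of $\mathcal{N}_{\mathcal{J}}$, $\mathcal{D}$, $A$, $B$, $C$ and $Z$ are hypothetical and chosen only for illustration; they are not taken from the paper, and the helper names are assumptions.

```python
from fractions import Fraction as Fr

def step_j(N_J, D, A, B, C, Z):
    """Increase of kappa_j that brings the cost to Z, Eq. (89)."""
    return (D * Z - N_J) / (B + C - (A + C) * Z)

def step_r(N_J, D, A, B, Z):
    """Increase of kappa_r that brings the cost to Z, Eq. (90)."""
    return (D * Z - N_J) / (B - A * Z)

def cost(N_J, D, num_slope, den_slope, delta):
    """Cost ratio after a step delta, with linear increments of numerator and denominator."""
    return (N_J + num_slope * delta) / (D + den_slope * delta)

# Hypothetical values (illustration only): J(kappa) = N_J / D = 1/2 < Z < 1.
N_J, D, A, B, C = Fr(1), Fr(2), Fr(1), Fr(2), Fr(1)
Z = Fr(3, 5)

d_j = step_j(N_J, D, A, B, C, Z)
d_r = step_r(N_J, D, A, B, Z)

print(d_j, d_r)                            # 1/9 1/7
print(cost(N_J, D, B + C, A + C, d_j))     # 3/5, cost after increasing kappa_j
print(cost(N_J, D, B, A, d_r))             # 3/5, cost after increasing kappa_r
```

With these numbers the step in state $j$ is smaller than the step in state $r$, which matches the ordering $\delta'_j<\delta''_r$ stated in the text.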



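\[
\frac{(\mathcal{D}Z-\mathcal{N}_{\mathcal{J}})\bigl(\mathcal{D}(BF-CG)+(CG-AF)\mathcal{N}_{\mathcal{J}}+(A-B)C\,\mathcal{N}_{\mathcal{W}}\bigr)}
{(B\mathcal{D}-A\mathcal{N}_{\mathcal{J}})\bigl((B+C)\mathcal{D}-(A+C)\mathcal{N}_{\mathcal{J}}\bigr)}>0. \tag{92}
\]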
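The fraction above can be checked numerically. The sketch below computes the reward difference between the two perturbed policies directly from the step sizes $(\mathcal{D}Z-\mathcal{N}_{\mathcal{J}})/(B+C-(A+C)Z)$ and $(\mathcal{D}Z-\mathcal{N}_{\mathcal{J}})/(B-AZ)$, and compares it with the closed-form fraction; it also checks the regrouping of the second factor as $(\mathcal{D}-\mathcal{N}_{\mathcal{J}})(FB-CG)+(B-A)(\mathcal{N}_{\mathcal{J}}F-\mathcal{N}_{\mathcal{W}}C)$. All numerical values and function names are hypothetical and serve only as an illustration under the stated assumptions ($G=0$, $F>C$, $B>A$, $\mathcal{D}>\mathcal{N}_{\mathcal{J}}$).

```python
from fractions import Fraction as Fr

def reward_gap(N_J, N_W, D, A, B, C, F, G, Z):
    """Reward after the step in state j minus reward after the step in state r."""
    d_j = (D * Z - N_J) / (B + C - (A + C) * Z)
    d_r = (D * Z - N_J) / (B - A * Z)
    w_j = (N_W + (G + F) * d_j) / (D + (A + C) * d_j)
    w_r = (N_W + G * d_r) / (D + A * d_r)
    return w_j - w_r

def second_factor(N_J, N_W, D, A, B, C, F, G):
    """Second factor of the numerator of the closed-form fraction."""
    return D * (B * F - C * G) + (C * G - A * F) * N_J + (A - B) * C * N_W

def closed_form(N_J, N_W, D, A, B, C, F, G, Z):
    """Closed-form fraction obtained after substituting the two step sizes."""
    num = (D * Z - N_J) * second_factor(N_J, N_W, D, A, B, C, F, G)
    den = (B * D - A * N_J) * ((B + C) * D - (A + C) * N_J)
    return num / den

def regrouped(N_J, N_W, D, A, B, C, F, G):
    """Same second factor, regrouped into two terms."""
    return (D - N_J) * (F * B - C * G) + (B - A) * (N_J * F - N_W * C)

# Hypothetical values (illustration only).
vals = dict(N_J=Fr(1), N_W=Fr(1, 2), D=Fr(2), A=Fr(1), B=Fr(2), C=Fr(1), F=Fr(3), G=Fr(0))
Z = Fr(3, 5)

print(reward_gap(**vals, Z=Z))    # 17/120
print(closed_form(**vals, Z=Z))   # 17/120
print(second_factor(**vals), regrouped(**vals))  # 17/2 17/2
```

Exact rational arithmetic is used so that the two expressions can be compared with equality rather than with a floating-point tolerance.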



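In order to complete the proof, the following inequality
needs to be proved:

\[
\frac{\mathcal{N}_{\mathcal{W}}+(G+F)\,\delta'_j}{\mathcal{D}+(A+C)\,\delta'_j}
>\frac{\mathcal{N}_{\mathcal{W}}+G\,\delta''_r}{\mathcal{D}+A\,\delta''_r}. \tag{91}
\]

By substituting Eq. (89) and Eq. (90) in Eq. (91), we obtain
Eq. (92). Note that

\[
\frac{\mathcal{D}Z-\mathcal{N}_{\mathcal{J}}}{(B\mathcal{D}-A\mathcal{N}_{\mathcal{J}})\bigl((B+C)\mathcal{D}-(A+C)\mathcal{N}_{\mathcal{J}}\bigr)}>0. \tag{93}
\]

In fact, recalling that $\kappa_j=\kappa_r$ by hypothesis, we have

\[
B-A=\alpha(1-\rho)\lambda\prod_{i=1,\,i\neq r}^{T}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)
=\alpha(1-\rho)\lambda\prod_{i=1,\,i\neq j}^{T}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)>0, \tag{94}
\]

and since $\mathcal{D}>\mathcal{N}_{\mathcal{J}}$ (because $\mathcal{N}_{\mathcal{J}}/\mathcal{D}=\mathcal{J}(\boldsymbol{\kappa})<1$), then

\[
(B\mathcal{D}-A\mathcal{N}_{\mathcal{J}})\bigl((B+C)\mathcal{D}-(A+C)\mathcal{N}_{\mathcal{J}}\bigr)>0. \tag{95}
\]

Moreover, due to Theorem 1 we have for $\delta''_r>0$

\[
Z=\mathcal{J}(\boldsymbol{\kappa}+\mathbf{u}_r\delta''_r)>\mathcal{J}(\boldsymbol{\kappa})=\frac{\mathcal{N}_{\mathcal{J}}}{\mathcal{D}}. \tag{96}
\]

Therefore,

\[
\mathcal{D}Z-\mathcal{N}_{\mathcal{J}}>0. \tag{97}
\]

The Theorem is then proved if the following inequality
holds:

\[
\mathcal{D}(BF-CG)+(CG-AF)\mathcal{N}_{\mathcal{J}}+(A-B)C\,\mathcal{N}_{\mathcal{W}}
=(\mathcal{D}-\mathcal{N}_{\mathcal{J}})(FB-CG)+(B-A)(\mathcal{N}_{\mathcal{J}}F-\mathcal{N}_{\mathcal{W}}C)>0. \tag{98}
\]

Define

\[
X=\alpha\prod_{i=1}^{T}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)>0. \tag{99}
\]

Then,

\[
\mathcal{D}-\mathcal{N}_{\mathcal{J}}=\alpha\Bigl(1-\prod_{i=1}^{T}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)\Bigr)=\alpha-X>0, \tag{100}
\]

and

\[
B-A=\frac{(1-\rho)\lambda}{\rho+(1-\rho)\lambda\kappa_j}\,X>0. \tag{101}
\]

In the second term of Eq. (98), $\mathcal{N}_{\mathcal{J}}$ can be split into two
terms

\[
\mathcal{N}_{\mathcal{J}}=(1-\alpha)+\alpha\sum_{t=1}^{r-1}\prod_{i=1}^{t}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)
+\alpha\sum_{t=r}^{T-1}\prod_{i=1}^{t}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr), \tag{102}
\]

so that the second summation corresponds to the summation in
$B$ (in $B$, the summation has the additional term corresponding to $t=T$ and
the products do not have the term $i=j$). $(\mathcal{D}-\mathcal{N}_{\mathcal{J}})FB+(B-A)\mathcal{N}_{\mathcal{J}}F$ in Eq. (98) can be simplified
as shown in Eq. (103).

Analogously, $\mathcal{N}_{\mathcal{W}}$ can be rewritten as

\[
\mathcal{N}_{\mathcal{W}}=(1-\alpha)+\alpha\sum_{t=1}\kappa_t\prod_{i=1}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)
+\sum_{t=r+1}\kappa_t\prod_{i=1}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr), \tag{104}
\]

so that one summation corresponds to the summation in $G$.
$-(\mathcal{D}-\mathcal{N}_{\mathcal{J}})GC-(B-A)\mathcal{N}_{\mathcal{W}}C$ in Eq. (98) can be
rewritten as reported in Eq. (105).

Eq. (98) is the sum of Eq. (103) and Eq. (105). By
hypothesis $\kappa_t=0$, $\forall t>r$, and thus $G=0$. The term $-\alpha\,G\,C$ in
Eq. (105) is then equal to zero. Eq. (106) reorganizes the sum
of Eqs. (103) and (105).

The first term of Eq. (106),

\[
\alpha F\Bigl(B-X\,\alpha(1-\rho)\lambda\prod_{i=1,\,i\neq j}^{T}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)\Bigr), \tag{107}
\]

is positive. In fact, $X<1$ and

\[
B-\alpha(1-\rho)\lambda\prod_{i=1,\,i\neq j}^{T}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)
=\alpha(1-\rho)\lambda\sum_{t=r}^{T-1}\ \prod_{i=1,\,i\neq j}^{t}\bigl(\rho+(1-\rho)\lambda\kappa_i\bigr)>0. \tag{108}
\]

Moreover, it can be shown that $F>C$. The proof is analogous
to that of Lemma 1 and is not reported herein. As a
consequence, the second term of Eq. (106) is positive. All the
other terms are trivially positive.
The inequality is then proved, as well as the Theorem.

Appendix D
Proof of Theorem 4

Proof: Theorem 4 states that starting from a policy $\boldsymbol{\kappa}$, for
any pair of indices $j$ and $r$, with $0<j<r$, such that $\kappa_j=\kappa_r$
and $\kappa_t=0$, $r<t<T$, if

\[
\mathcal{J}_p(\boldsymbol{\kappa}')=\mathcal{J}_p(\boldsymbol{\kappa}''), \tag{109}
\]

then

\[
\mathcal{W}_s(\boldsymbol{\kappa}')<\mathcal{W}_s(\boldsymbol{\kappa}''), \tag{110}
\]

with $\boldsymbol{\kappa}'=\boldsymbol{\kappa}-\mathbf{u}_j\delta'_j$ and $\boldsymbol{\kappa}''=\boldsymbol{\kappa}-\mathbf{u}_r\delta''_r$, $0<\delta'_j<\kappa_j$ and $0<\delta''_r<\kappa_r$.
The proof is similar to that provided in the previous Appendix. As
done in the previous proof, we fix $\delta''_r$, with $0<\delta''_r<1-\kappa_r$,
and we denote the cost associated with the policy obtained
by decreasing $\kappa_r$ by $\delta''_r$ with $Z=\mathcal{J}_p(\boldsymbol{\kappa}-\mathbf{u}_r\delta''_r)$. Note
that $0<Z<\mathcal{J}_p(\boldsymbol{\kappa})$. Through considerations entirely analogous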



{V-Nj)F B+{B - A)M.jF ^aF B - X F B + ^^^^ J\fj 



P + (1 - p)XKj 

aFB + il-a) ^l—fl^XF-XFa{l-p)XXF TT (p+(1-p)AkO 

p+ (1 - p)Xkj . . 

.7-1 i 

+Xi^a(l-p)A^ [| (p + (1 - p)Ak,) + X C. (103) 

t=l i=l,i5^J 



-(P-A/'j)G C-(B - A)N'wC = -aC G+X C G- ^^^^ X C A/W 



-a C G- (1-a) 



p + (1 - p)Akj 
(l-p)A 



p + (1 — p)Akj 



^X C a{l- p)XY,^^t n (p+(1-p)Akz) 

t=l i=l,i^i 



^-aC G- 



(l-p)A 



p + (1 - p)Ak^ 



-X G + X Ga(l -p)A^Kf J| (p+(l-p)Afc,) 



t=l i=l,i^j 



-X C F + X G aW{p+{l- p)Xk,). 



(105) 



2)(Bi^-GG) + (GG-AF)A(7+(A-B)GA%' = a i^(B-X a(l -p)A ]J (p+(l - p)Ak,) 



(1-a) 



(1 - P)A 
p+ (1 -p)Akj 



i-i t 



X {F-G)+X F ail^p)Xj2 H (P + (1 - P)^^.) 



t=l 

1- 1 i t-i 

X G aJ|(p+(l-p)AK,)+X G a(l-p)A^Kt J| (p+(1-p)Ak,; 

2— 1 i— 1 i—l.i^j 



(106) 



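to those provided in Appendix C, it can be shown that
there exists $\delta'_j$ such that $\mathcal{J}_p(\boldsymbol{\kappa}-\mathbf{u}_j\delta'_j)=\mathcal{J}_p(\boldsymbol{\kappa}-\mathbf{u}_r\delta''_r)=Z$, with
$0<\delta'_j<\delta''_r<\kappa_r=\kappa_j$. The corresponding values of $\delta'_j$ and $\delta''_r$ can
be readily found to be

\begin{align}
\delta'_j &= \frac{\mathcal{N}_{\mathcal{J}}-\mathcal{D}Z}{B+C-(A+C)Z}, \tag{111}\\
\delta''_r &= \frac{\mathcal{N}_{\mathcal{J}}-\mathcal{D}Z}{B-AZ}. \tag{112}
\end{align}

Observe that the above values of $\delta'_j$ and $\delta''_r$ are the opposites
of those in Eqs. (89) and (90).

By substituting the above equations in

\[
\mathcal{W}_s(\boldsymbol{\kappa}')-\mathcal{W}_s(\boldsymbol{\kappa}'')
=\frac{\mathcal{N}_{\mathcal{W}}-(G+F)\,\delta'_j}{\mathcal{D}-(A+C)\,\delta'_j}
-\frac{\mathcal{N}_{\mathcal{W}}-G\,\delta''_r}{\mathcal{D}-A\,\delta''_r}, \tag{113}
\]

the same fraction as in Eq. (92) is obtained, and according to
Eqs. (95) and (98),

\[
\frac{\mathcal{D}(BF-CG)+(CG-AF)\mathcal{N}_{\mathcal{J}}+(A-B)C\,\mathcal{N}_{\mathcal{W}}}
{(B\mathcal{D}-A\mathcal{N}_{\mathcal{J}})\bigl((B+C)\mathcal{D}-(A+C)\mathcal{N}_{\mathcal{J}}\bigr)}>0. \tag{114}
\]

Differently from the proof of Theorem 3, since
$Z<\mathcal{J}_p(\boldsymbol{\kappa})=\mathcal{N}_{\mathcal{J}}/\mathcal{D}$, then

\[
\mathcal{D}Z-\mathcal{N}_{\mathcal{J}}<0. \tag{115}
\]

Therefore,

\[
\mathcal{W}_s(\boldsymbol{\kappa}')-\mathcal{W}_s(\boldsymbol{\kappa}'')<0, \tag{116}
\]

and the theorem is proved.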
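The sign argument that closes this proof can be written out in one line. The factorization below is a sketch that restates the reward difference as the product of the two quantities whose signs were just established; it introduces no new symbols.

\[
\mathcal{W}_s(\boldsymbol{\kappa}')-\mathcal{W}_s(\boldsymbol{\kappa}'')
=\underbrace{(\mathcal{D}Z-\mathcal{N}_{\mathcal{J}})}_{<0\ \text{since } Z<\mathcal{N}_{\mathcal{J}}/\mathcal{D}}
\cdot
\underbrace{\frac{\mathcal{D}(BF-CG)+(CG-AF)\mathcal{N}_{\mathcal{J}}+(A-B)C\,\mathcal{N}_{\mathcal{W}}}
{(B\mathcal{D}-A\mathcal{N}_{\mathcal{J}})\bigl((B+C)\mathcal{D}-(A+C)\mathcal{N}_{\mathcal{J}}\bigr)}}_{>0}
<0 .
\]

The only change with respect to the case of increased transmission probabilities is the sign of the first factor, which is why the inequality between the rewards is reversed.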



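Appendix E
Proof of Theorems 3 and 4 for Constraint on the
Failure Probability

Proof: As discussed in Section IV-B, the cost increase/decrease
induced by an increase/decrease of the transmission
probability in state $\theta>0$ does not depend on $\theta$.
In this case, a transmission probability increased by $\delta$
(a decreased transmission probability corresponds to a negative $\delta$ in the
following) in state $j$ or $r$ results in the same increase of
the average cost. More formally, fix $j$, $r$ and $\delta$, with
$0<j<r<T$, $-\min(\kappa_j,\kappa_r)<\delta<\min(1-\kappa_j,1-\kappa_r)$ and $\delta\neq 0$,
then $\mathcal{J}_p^{F}(\boldsymbol{\kappa}+\mathbf{u}_j\delta)=\mathcal{J}_p^{F}(\boldsymbol{\kappa}+\mathbf{u}_r\delta)$.

Therefore, starting from a policy $\boldsymbol{\kappa}$, and defining the policies
$\boldsymbol{\kappa}'=\boldsymbol{\kappa}+\mathbf{u}_j\delta'_j$ and $\boldsymbol{\kappa}''=\boldsymbol{\kappa}+\mathbf{u}_r\delta''_r$, then

\[
\mathcal{J}_p^{F}(\boldsymbol{\kappa}')=\mathcal{J}_p^{F}(\boldsymbol{\kappa}'')=Z, \tag{117}
\]

with $0<Z<1$, $Z\neq\mathcal{J}_p^{F}(\boldsymbol{\kappa})$, only if $\delta'_j=\delta''_r=\delta(Z)$, where $\delta(Z)$ is
a function of $Z$. Note that $\delta(Z)>0$ if $Z>\mathcal{J}_p^{F}(\boldsymbol{\kappa})$, and $\delta(Z)<0$
otherwise.

According to the notation introduced in Appendix C, the
difference between the rewards achieved with policies $\boldsymbol{\kappa}'$ and
$\boldsymbol{\kappa}''$ is

\[
\mathcal{W}_s(\boldsymbol{\kappa}')-\mathcal{W}_s(\boldsymbol{\kappa}'')
=\frac{\mathcal{N}_{\mathcal{W}}+(G+F)\,\delta(Z)}{\mathcal{D}+(A+C)\,\delta(Z)}
-\frac{\mathcal{N}_{\mathcal{W}}+G\,\delta(Z)}{\mathcal{D}+A\,\delta(Z)}, \tag{118}
\]

which is larger than zero if the following holds:

\[
\delta(Z)\Bigl(F\bigl(\mathcal{D}+A\,\delta(Z)\bigr)-C\bigl(\mathcal{N}_{\mathcal{W}}+G\,\delta(Z)\bigr)\Bigr)>0. \tag{119}
\]

Since $F>C$, as previously stated, and

\[
\mathcal{W}_s(\boldsymbol{\kappa}'')=\mathcal{W}_s\bigl(\boldsymbol{\kappa}+\mathbf{u}_r\delta(Z)\bigr)
=\frac{\mathcal{N}_{\mathcal{W}}+G\,\delta(Z)}{\mathcal{D}+A\,\delta(Z)}<1 \tag{120}
\]

by construction, then,

\[
F\bigl(\mathcal{D}+A\,\delta(Z)\bigr)-C\bigl(\mathcal{N}_{\mathcal{W}}+G\,\delta(Z)\bigr)>0. \tag{121}
\]

Finally, if $Z>\mathcal{J}_p^{F}(\boldsymbol{\kappa})$, then $\delta(Z)>0$ and $\mathcal{W}_s(\boldsymbol{\kappa}')>\mathcal{W}_s(\boldsymbol{\kappa}'')$. If
$Z<\mathcal{J}_p^{F}(\boldsymbol{\kappa})$, then $\delta(Z)<0$ and $\mathcal{W}_s(\boldsymbol{\kappa}')<\mathcal{W}_s(\boldsymbol{\kappa}'')$. ■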
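For completeness, the step from the reward difference to the sign condition can be spelled out. The short derivation below is a sketch that writes $\delta$ for $\delta(Z)$ and assumes, as implied by the ratios being well defined, that both denominators $\mathcal{D}+(A+C)\delta$ and $\mathcal{D}+A\delta$ are positive.

\begin{align*}
\frac{\mathcal{N}_{\mathcal{W}}+(G+F)\delta}{\mathcal{D}+(A+C)\delta}-\frac{\mathcal{N}_{\mathcal{W}}+G\delta}{\mathcal{D}+A\delta}
&=\frac{\bigl(\mathcal{N}_{\mathcal{W}}+G\delta+F\delta\bigr)\bigl(\mathcal{D}+A\delta\bigr)-\bigl(\mathcal{N}_{\mathcal{W}}+G\delta\bigr)\bigl(\mathcal{D}+A\delta+C\delta\bigr)}{\bigl(\mathcal{D}+(A+C)\delta\bigr)\bigl(\mathcal{D}+A\delta\bigr)}\\
&=\frac{\delta\,\Bigl(F\bigl(\mathcal{D}+A\delta\bigr)-C\bigl(\mathcal{N}_{\mathcal{W}}+G\delta\bigr)\Bigr)}{\bigl(\mathcal{D}+(A+C)\delta\bigr)\bigl(\mathcal{D}+A\delta\bigr)} .
\end{align*}

The common term $(\mathcal{N}_{\mathcal{W}}+G\delta)(\mathcal{D}+A\delta)$ cancels in the numerator, so the sign of the difference is the sign of $\delta\,\bigl(F(\mathcal{D}+A\delta)-C(\mathcal{N}_{\mathcal{W}}+G\delta)\bigr)$. Since $F>C$ and $\mathcal{N}_{\mathcal{W}}+G\delta<\mathcal{D}+A\delta$, the bracket is positive and the sign is that of $\delta$.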



References

[1] Q. Zhao, L. Tong, A. Swami, and Y. Chen, "Decentralized
cognitive MAC for opportunistic spectrum access in ad hoc networks:
a POMDP framework," IEEE J. Select. Areas Commun., vol. 25, no. 3,
pp. 589-600, Apr. 2007.

[2] O. Simeone, Y. Bar-Ness, and U. Spagnolini, "Stable throughput
of cognitive radios with and without relaying capability," IEEE Trans.
Wireless Commun., vol. 55, no. 12, pp. 2351-2360, Dec. 2007.

[3] S. Geirhofer, L. Tong, and B. M. Sadler, "Dynamic spectrum
access in the time domain: modeling and exploiting white space," IEEE
Commun. Mag., vol. 45, no. 5, pp. 66-87, May 2007.

[4] R. Urgaonkar and M. J. Neely, "Opportunistic scheduling with reliability
guarantees in cognitive radio networks," in Proc. of the 27th IEEE
Conference on Computer Communications (IEEE INFOCOM), Phoenix,
AZ, USA, Apr. 2008, pp. 1301-1309.

[5] W. Zhang and U. Mitra, "A spectrum-shaping perspective on cognitive
radio: uncoded primary transmission case," in Proc. of IEEE ISIT,
Toronto, Ontario, Canada, July 2008.

[6] L. Cao and H. Zheng, "Distributed rule-regulated spectrum sharing,"
IEEE J. Select. Areas Commun., vol. 26, no. 1, pp. 130-145, Jan. 2008.

[7] H. Su and X. Zhang, "Cross-layer based opportunistic MAC protocols
for QoS provisioning over cognitive radio wireless networks," IEEE J.
Select. Areas Commun., vol. 26, no. 1, pp. 118-129, Jan. 2008.

[8] Y. Xing, C. N. Mathur, M. A. Haleem, R. Chandramouli,
and K. P. Subbalakshmi, "Dynamic spectrum access with QoS and
interference temperature constraints," IEEE Trans. Mobile Comput.,
vol. 6, no. 4, pp. 423-433, Apr. 2007.

[9] J. Mitola, "Cognitive radio: an integrated agent architecture for
software-defined radio," Doctor of Technology, Royal Inst. Technol. (KTH),
Stockholm, Sweden, 2000.

[10] R. Zhang, "Optimal power control over fading cognitive radio channel
by exploiting primary user CSI," in IEEE Global Telecommunications
Conference, IEEE GLOBECOM 2008, 30 Nov.-4 Dec. 2008,
pp. 1-5.

[11] P. Popovski, H. Yomo, K. Nishimori, R. D. Taranto, and R. Prasad,
"Opportunistic interference cancellation in cognitive radio systems,"
in 2nd IEEE International Symposium on New Frontiers in Dynamic
Spectrum Access Networks, DySPAN 2007, Apr. 2007, pp. 472-475.

[12] M. Gastpar, "On capacity under receive and spatial spectrum-sharing
constraints," IEEE Transactions on Information Theory, vol. 53, no. 2,
pp. 471-487, Feb. 2007.

[13] M. Levorato, U. Mitra, and M. Zorzi, "On optimal control of wireless
networks with multiuser detection, hybrid ARQ and distortion constraints,"
in Proc. of the 28th IEEE Conference on Computer Communications
(IEEE INFOCOM), Rio de Janeiro, Brazil, Apr. 2009.

[14] R. A. Tannious and A. Nosratinia, "Cognitive radio protocols based on
exploiting hybrid ARQ retransmissions," IEEE Transactions on Wireless
Communications, vol. 9, no. 9, pp. 2833-2841, Sept. 2010.

[15] K. Eswaran, M. Gastpar, and K. Ramchandran, "Bits through ARQs:
Spectrum sharing with a primary packet system," in IEEE International
Symposium on Information Theory, June 2007, pp. 2171-2175.

[16] R. Zhang, "On active learning and supervised transmission of spectrum
sharing based cognitive radios by exploiting hidden primary radio
feedback," IEEE Transactions on Communications, vol. 58, no. 10, pp.
2960-2970, Oct. 2010.

[17] T. Cover and J. Thomas, Elements of Information Theory. New York:
Wiley, 1991.

[18] D. P. Bertsekas, Dynamic Programming and Optimal Control, 2nd ed.
Belmont, MA: Athena Scientific, 2001, vol. 2.

[19] K. W. Ross, "Randomized and past-dependent policies for Markov
decision processes with multiple constraints," Operations Research,
vol. 37, no. 3, pp. 474-477, May-June 1989.

[20] S. P. Meyn and R. L. Tweedie, Markov Chains and Stochastic Stability.
London: Springer-Verlag, 1993.

[21] S. Mahadevan, "Average reward reinforcement learning: Foundations,
algorithms, and empirical results," Machine Learning, vol. 22, no. 1,
pp. 159-195, 1996.

[22] F. Fu and M. van der Schaar, "Structure-Aware Stochastic Control for
Transmission Scheduling," arXiv preprint arXiv:1003.2471, 2010.

[23] S. Firouzabadi, M. Levorato, D. O'Neill, and A. Goldsmith, "Learning
interference strategies in cognitive ARQ networks," in IEEE Global
Telecommunications Conference, IEEE GLOBECOM 2010, Nov. 2010,
pp. 1-6.

[24] M. Levorato, S. Firouzabadi, and A. Goldsmith, "Cognitive interference
networks with partial and noisy observations: a learning framework,"
accepted for presentation at IEEE Global Telecommunications Conference,
IEEE GLOBECOM 2011.

[25] L. Kaelbling, M. Littman, and A. Cassandra, "Planning and acting in
partially observable stochastic domains," Artificial Intelligence, vol. 101,
no. 1-2, pp. 99-134, 1998.



Marco Levorato (S'06, M'09) obtained both the BE (Electronics and
Telecommunications Engineering) and the ME (Telecommunications Engineering)
summa cum laude from the University of Ferrara, Italy, in 2002 and 2005,
respectively. In 2009, he received a Ph.D. in Information Engineering from
the University of Padova. During 2008 he was on leave at the University
of Southern California, Los Angeles, United States. In 2009 he was a
postdoctoral researcher at the University of Padova. Since January 2010, he has been a
postdoctoral researcher at Stanford and the University of Southern California
(USC).



Urbashi Mitra (F'07) Urbashi Mitra received the B.S. and the M.S. degrees 
from the University of California at Berkeley in 1987 and 1989 respectively, 
both in Electrical Engineering and Computer Science. From 1989 until 1990 
she worked as a Member of Technical Staff at Bellcore in Red Bank, NJ. 
In 1994, she received her Ph.D. from Princeton University in Electrical 
Engineering. From 1994 to 2000, Dr. Mitra was a member of the faculty 
of the Department of Electrical Engineering at The Ohio State University, 
Columbus, Ohio. In 2001, she joined the Department of Electrical Engineering 
at the University of Southern California, Los Angeles, where she is currently 
a Professor. Dr Mitra has been an Associate Editor for the following 
IEEE publications: Transactions on Information Theory (2007-2011), Journal 
of Oceanic Engineering (2006-2011), and Transactions on Communications 
(1996-2001). Dr Mitra served two terms as a member of the IEEE Information 
Theory Society's Board of Governors (2002-2007). She is the recipient 
of: USC Center for Excellence in Research Fellowship (2010-2013), Best
Applications Paper Award 2009 International Conference on Distributed
Computing in Sensor Systems, the Viterbi School of Engineering Dean's
Faculty Service Award (2009), USC Mellon Mentoring Award (2008), IEEE
Fellow (2007), Texas Instruments Visiting Professor (Fall 2002, Rice
University), 2001 Okawa Foundation Award, 2000 Lumley Award for Research
(OSU College of Engineering), 1997 MacQuigg Award for Teaching (OSU
College of Engineering), 1996 National Science Foundation (NSF) CAREER
Award. She has co-chaired: (technical program) 2012 International Conference 
on Signal Processing and Communications, Bangalore India, (general) first 
ACM Workshop on Underwater Networks at Mobicom 2006, Los Angeles, CA 
and the (technical) IEEE Communication Theory Symposium at ICC 2003 in 
Anchorage, AK. Dr. Mitra was the Tutorials Chair for IEEE ISIT 2007 in Nice,
France and the Finance Chair for IEEE ICASSP 2008 in Las Vegas, NV. Dr 
Mitra has held visiting appointments at: the Delft University of Technology, 
Stanford University, Rice University, and the Eurecom Institute. She served 
as co-Director of the Communication Sciences Institute at the University of 
Southern California from 2004-2007. 



Michele Zorzi (F'07) was born in Venice, Italy, in 1966. He received 
his Laurea degree and Ph.D. in electrical engineering from the University 
of Padova, Italy, in 1990 and 1994, respectively. During academic year 
1992/93, he was on leave at the University of California, San Diego (UCSD) 
attending graduate courses and doing research on multiple access in mobile 
radio networks. In 1993, he joined the faculty of the Dipartimento di 
Elettronica e Informazione, Politecnico di Milano, Italy. After spending three 
years with the Center for Wireless Communications at UCSD, in 1998 he 
joined the School of Engineering of the University of Ferrara, Italy, and in 
2003 joined the Department of Information Engineering of the University 
of Padova, Italy, where he is currently a Professor. His present research 
interests include performance evaluation in mobile communications systems, 
random access in mobile radio networks, ad hoc and sensor networks, energy 
constrained communications protocols, cognitive networks, and underwater 
communications and networking. 

He was Editor-in-Chief of the IEEE Wireless Communications magazine
from 2003 to 2005 and Editor-in-Chief of the IEEE Transactions
on Communications from 2008 to 2011, and serves on the Editorial
Board of the Wiley Journal of Wireless Communications and
Mobile Computing. He was also guest editor for special issues in IEEE
Personal Communications and IEEE Journal on Selected Areas
in Communications. He served as a Member-at-Large of the Board of
Governors of the IEEE Communications Society from 2009 to 2011. 



